Combinatorics

December 25, 2024

1 Misc
1.1 Perfect cover of chessboard
1.2 Magic square (of odd order)
2 Nim Game
2.1 Restricted version
2.2 Sprouts
3 Counting Principles
3.1 Basic Counting Principle
3.2 Pigeonhole Principle
3.3 Permutation
3.3.1 ○permutation of multisets
3.3.2 generating permutations
3.4 Combination
3.4.1 distinguishable boxes
3.4.2 ○combination with repetition / combinations of multisets
3.4.3 ○generating combinations
3.5 Binomial theorem and multinomial theorem
3.5.1 Generalized
3.5.2 Unimodality
3.6 Common Counting Methods
3.7 Other Properties of Permutations
3.7.1 ○inversions of permutations
3.7.2 ○cycles of permutations
4 Counting Applications
4.1 ○Partially ordered sets
4.1.1 Sperner’s Theorem
4.1.2 ○Dilworth Theorem
4.2 Graph
4.3 Ramsey Theory
4.3.1 definitions
4.3.2 properties
4.3.3 TODO proof of the theorem
5 Inclusion-exclusion principle
5.1 weighted version
5.2 derangements
5.2.1 [TODO: circular non-consecutive permutations]
5.3 permutation with forbidden positions
5.4 [TODO: nonconsecutive permutations]
5.5 Euler totient function
5.6 rook polynomials
6 ○Möbius inversion
6.1 Möbius inversion
6.2 product posets
6.3 ○[TODO: number theoretical]
6.4 ○circular permutations with repetition
7 Linear Recurrence
7.1 ○structural theorems
7.2 constant coefficients
7.2.1 homogeneous
7.2.2 nonhomogeneous
8 ○Generating Functions
8.1 ordinary generating function
8.2 exponential generating function
8.3 connection with linear recurrence
8.4 Catalan Sequence
9 Special Counting Sequences
9.1 ○Catalan Sequence
9.1.1 [TODO: A geometric example]
9.2 Difference sequences
9.3 ○Stirling numbers
9.3.1 of the second kind
9.3.2 of the first kind
10 Graph-related Problems
10.1 Matching
10.2 Vertex Covering
10.3 ○Systems of distinct representatives
10.4 Stable Marriage

1 Misc

1.1 Perfect cover of chessboard

Definition 1.1 (b-omino). A $𝑏$ -omino is a $(1, 𝑏)$ -board or a $(𝑏, 1)$ -board, where $𝑏 \geq 2$ .

Definition 1.2 (Perfect cover). A perfect cover of an $𝑚$ -by- $𝑛$ chessboard by $𝑏$ -ominoes is an arrangement of $𝑏$ -ominoes on the chessboard so that no two $𝑏$ -ominoes overlap.

Theorem 1.1. An $𝑚$ -by- $𝑛$ chessboard has a perfect cover by $𝑏$ -ominoes if and only if $𝑏$ divides either $𝑚$ or $𝑛$ .

Proof. $\Leftarrow$ side is obvious: Place all dominoes vertically or horizontally.

$⟹$ side: Consider the cyclic coloring (generalization of the alternate black-white coloring):

\begin{matrix} + 1 \to \\ - 1 ↓ & [\begin{matrix} 1 & 2 & \dots & 𝑏 & 1 & \dots \\ 𝑏 & 1 & \dots & 𝑏 - 1 & 𝑏 & \dots \\ 𝑏 - 1 & 𝑏 & \dots & 𝑏 - 2 & 𝑏 - 1 & \dots \\ ⋮ & ⋮ & ⋮ & ⋮ \\ 2 & 3 & \dots & 1 & 2 & \dots \\ 1 & 2 & \dots & 𝑏 & 1 & \dots \\ ⋮ & ⋮ & ⋮ & ⋮ \end{matrix}] \end{matrix} .

It follows that, no matter how the $𝑏$ -ominos are placed, each of them will over exactly one square of each of the $𝑏$ colors.

We color the $𝑚 \times 𝑛$ -board in this way. Suppose $𝑚 = 𝑝𝑏 + 𝑟, 0 \leq 𝑟 < 𝑏$ and $𝑛 = 𝑞𝑏 + 𝑠, 0 \leq 𝑠 < 𝑏$ . Then, we divide the chessboard as

[\begin{matrix} 𝑝𝑏 & \times & 𝑛 \\ 𝑟 \times 𝑞𝑏 & | & 𝑟 \times 𝑠 \end{matrix}] .

Apparently, the $𝑝𝑏 \times 𝑛$ and $𝑟 \times 𝑞𝑏$ parts have the same number of squares for each color. If the whole board has a perfect covering, then the remaining $𝑟 \times 𝑠$ part must also have the same number of squares for each color. WLOG, we can assume that $𝑟 < 𝑠$ (if not, switch the two dimensions of the board). Then, the top left corner of the $𝑟 \times 𝑠$ part must be $1$ . By the nature of the cyclic coloring, $1$ occurs $𝑟$ times (look at the diagonal) in this part, thus, so do other colors. That is, $𝑟𝑏 = 𝑟𝑠$ . We can infer from this that $𝑟 = 0$ . □

1.2 Magic square (of odd order)

Definition 1.3 (magic square). A magic square of order $𝑛$ is an $𝑛 \times 𝑛$ -array constructed out of the integers $1, 2, \dots, 𝑛^{2}$ in such a way that the sum of each row, each column, and each diagonal is the same. The sum is called the magic sum of the magic square.

Method 1.1 (De La Loubère’s method). This is only for magic square of odd order. We consider a double infinite chessboard with the $(𝑖, 𝑗)$ -th entry denoted by $𝑎_{𝑖, 𝑗}$ , where $𝑎_{𝑖 + 𝑛, 𝑗} = 𝑎_{𝑖, 𝑗 + 𝑛} = 𝑎_{𝑖, 𝑗}$ .

Steps:

1.: $𝑎_{1, (𝑛 + 1) / 2} = 1$ .
2.: $𝑎_{𝑖 - 1, 𝑗 + 1} : = 𝑎_{𝑖, 𝑗} + 1$ when $𝑛 ∤ 𝑎_{𝑖, 𝑗}$
3.: $𝑎_{𝑖 + 1, 𝑗} : = 𝑎_{𝑖, 𝑗} + 1$ when $𝑛 | 𝑎_{𝑖, 𝑗}$ and $𝑎_{𝑖, 𝑗} < 𝑛^{2}$ .

2 Nim Game

Definition 2.1 (Nim Game). Suppose there are 2 players and $𝑘$ heap containing $𝑛_{1}, \dots, 𝑛_{𝑘}$ coins, respectively.

Rules:

1.: The two players alternate turns. Player I plays first.
2.: Each player, when it is their turn, selects one of the heaps and removes one or more coins from it.

The game ends when all the heaps are empty. The player who takes the last coin wins.

Definition 2.2. Writing the sizes of heaps as binary numbers. A nim game is called balanced if the sums of bits are even on all positions.

Proposition 2.1 (From balanced to unbalanced). The removal of any amount of coins from any heap breaks the balance.

Proof. In binary representation, this is just subtracting multiple 1s, which turns some but necessarily all even positions to odd. This is because some position with sum 0 will borrow a 1 from the next position. A concrete example: two piles 110 and 110. Subtract 011 from the first pile, which becomes 0 1 1. The sums become 1 2 1 from 2 2 0. □

Proposition 2.2 (From Unbalanced to Balanced). A player can always turn an unbalanced state to a balanced one.

Proof. Suppose we have $𝑘$ heaps and at least one of them is non-empty.

1.: Find the odd bits. We need to choose a heap and flip its bits on these positions.
2.: Notice that there might be some odd position where the bit of this heap is 0. Flipping 0 to 1 is not trivially doable (will increase the size of bit).
3.: For every of such positions, we need to “borrow” some 1-bit on higher (or equal) positions, i.e., flipping that bit to 0 so that we have the budget to flip this bit to 1.
4.: Only need one higher (or equal) 1-bit.
5.: Can just choose the heap with 1 on the highest odd position.

□

2.1 Restricted version

Suppose we add the restriction that each player can move only $𝑝 \in {1, 2, \dots, 𝑞}$ coins each time.

2.2 Sprouts

3 Counting Principles

3.1 Basic Counting Principle

Definition 3.1 (Product Rule).

Suppose a procedure can be broken down into $𝑚 > 1$ tasks $𝑇_{1}, 𝑇_{2}, \dots, 𝑇_{𝑚}$ , and carried out by performing the tasks in sequence.
If each task $𝑇_{𝑖}$ can be done in $𝑛_{𝑖}$ ways regardless of how the previous tasks were done, then there are total $\prod_{𝑖 = 1}^{𝑚} 𝑛_{𝑖}$ ways to carry out the procedure

e.g. There are three cities A, B and C. Suppose you start from A, pass through B and end at C. There are $3$ ways from A to B, $2$ ways from B to C. Then the number of total ways from A to B to C is $3 \times 2 = 6$ .

Definition 3.2 (Sum Rule).

Suppose a procedure can be carried out in one of $𝑚 > 1$ methods $𝑇_{1}, 𝑇_{2}, \dots, 𝑇_{𝑚}$ .
If each method $𝑇_{𝑖}$ can be down in $𝑛_{𝑖}$ ways, then there are total $\sum_{𝑖 = 1}^{𝑚} 𝑛_{𝑖}$ ways to carry out the procedure.

e.g. There are $3$ method to transport from A to B, by train, by car and by plane. These $3$ method have $1, 2, 3$ choices, respectively. Then there are total $1 + 2 + 3 = 6$ ways.

3.2 Pigeonhole Principle

Theorem 3.1 (Pigeonhole Principle). Different descriptions:

If $𝑘 + 1$ or more objects are placed into $𝑘$ boxes where $𝑘$ is a positive integer, then there is at least one box containing two or more objects.
(I prefer this) Let $𝑥_{1}, \dots, 𝑥_{𝑘}$ be non-negative integer variables. Then, for every solution to the equation $𝑥_{1} + \dots + 𝑥_{𝑘} = 𝑘$ , there exists an $𝑖$ such that $𝑥_{𝑖} \geq 2$ .

Proof: by contradiction

Example 3.1. $25$ points are marked inside the area bounded by a rectangle that is $6$ cm long and $4$ cm wide. Show that there are at least $2$ points that are at most $\sqrt{2}$ cm apart.

Proof: Divide the area into $24$ squares of $1 \times 1$ cm. By the pigeonhole Principle, at least $2$ points will be in the same square, The biggest distance between a square is $\sqrt{2}$ cm.

Theorem 3.2 (Pigeonhole Principle, Strong form). Different descriptions:

Let $𝑞_{1}, \dots, 𝑞_{𝑘}$ be positive integers. If $(𝑞_{1} + \dots + 𝑞_{𝑘} - 𝑘 + 1)$ objects are put into $𝑘$ boxes, then the 1st box contains at least $𝑞_{1}$ objects, or the 2nd box contains at least $𝑞_{2}$ objects, …, or the $𝑘$ th box contains at least $𝑞_{𝑘}$ objects.
(I prefer this) Let $𝑥_{1}, \dots, 𝑥_{𝑘}$ be non-negative integer variables. Then, for every solution to the equation $𝑥_{1} + \dots + 𝑥_{𝑘} = 𝑞_{1} + \dots + 𝑞_{𝑘} - 𝑘 + 1$ , there exists an $𝑖$ such that $𝑥_{𝑖} \geq 𝑞_{𝑖}$ .

Proof: by contradiction.

Remark 1. Why I prefer the second description? The first description implies that (1) the objects are identical, (2) boxes have no size limits, and (3) the objects are distributed into boxes uniformly at random. While this scenario is valid, many applications of the Pigeonhole principle involve more complex conditions and relations between objects and boxes. Relying solely on the first description may make it unobvious or difficult to apply the theorem.

See

Example 3.2 for a case where boxes have constraints.
Example 3.3 and 3.4 for cases where only partial information of the problem is used.

It is common to these examples that they cannot be simply modeled by randomly putting identical objects into boxes.

Corollary 3.1. When $𝑞_{1} = \dots = 𝑞_{𝑘} = 𝑞$ , we have the following: If $𝑘 (𝑞 - 1) + 1$ objects are placed into $𝑘$ boxes, then there is at least one box containing $𝑞$ or more objects.

Example 3.2. If we draw $𝑛 + 1$ distinct integers from ${1, \dots, 2 𝑛}$ , then there are always two which differ by 1.

Let $𝑥_{2 𝑖 - 1, 2 𝑖}$ be a non-negative integer variable representing the number of integers drawn from $2 𝑖 - 1$ and $2 𝑖$ . We have $𝑥_{2 𝑖 - 1, 2 𝑖} \leq 2$ , but this upper bound does not affect us applying PP to derive $\exists 𝑖, 𝑥_{2 𝑖 - 1, 2 𝑖} \geq 2$ .

Example 3.3. In a room there are 10 people. Their ages are given in whole numbers only and are between 1 and 60 (both included). Then, one can always find two disjoint non-empty groups of people (no need to cover all people) the sum of whose ages is the same. Can 10 be replaced by a smaller number.

(An observation irrelevant to PP) It suffices to find two different (but possibly overlapping) subsets of people with the same sum of ages. This is because we can take the intersection out of the two subsets to form two disjoint groups satisfying the condition (think of why).

For 10 people, there are $2^{10} = 1024$ subsets. The possible sum of ages ranges from 1 to 600. Let $𝑥_{𝑖}$ be a non-negative integer variable representing the number of subsets with the sum equal to $𝑖$ . Then, $𝑥_{1} + \dots + 𝑥_{600} = 1024 ⟹ \exists 𝑖, 𝑥_{𝑖} \geq 2$ . Note that we allow some redundancy here. E.g., the empty subset and the entire subset.

The result also holds for 9 people, though not that trivially provable because $2^{9} \leq 60 \cdot 9$ . We can prune out the easy case where there exists two people with the same age. Then, the sum of age can be at most $60 + 59 + \dots + 52 = 504$ .

Actually, the results still hold for 8 people but not 7 people. It’s no longer easy to use PP for the proof, which is reasonable because we only apply PP to partial information.

Example 3.4. Consider an undirected graph with 100 nodes. Each node has an even degree. Then, there must be three nodes with the same degree.

Let $𝑥_{𝑖}$ be a non-negative integer variable representing the number of nodes with degree $𝑖$ , for $𝑖 = 0, 2, \dots, 98$ . Then, a necessary but not sufficient condition for the problem is that $𝑥_{0} + 𝑥_{2} + \dots + 𝑥_{98} = 100$ . With this we can only infer that $\exists 𝑖, 𝑥_{𝑖} \geq 2$ . To approach the intended result, we have to prune the case a bit:

If $𝑥_{0} = 0$ (no node with degree 0), by PP, $𝑥_{2} + \dots + 𝑥_{98} = 100 ⟹ \exists 𝑖 \geq 2, 𝑥_{𝑖} \geq 3$ .
If $𝑥_{0} = 1$ (1 node with degree 0), by PP, $𝑥_{2} + \dots + 𝑥_{98} = 99 ⟹ \exists 𝑖 \geq 2, 𝑥_{𝑖} \geq 3$ .
If $𝑥_{0} = 3$ , done.
If $𝑥_{0} = 2$ , then $𝑥_{98} = 0$ , by PP, $𝑥_{2} + \dots + 𝑥_{96} = 98 ⟹ \exists 2 \leq 𝑖 \leq 96, 𝑥_{𝑖} \geq 3$ .

3.3 Permutation

Definition 3.3 (Permutation). Let $𝑛, 𝑘$ be two integers with $0 \leq 𝑘 \leq 𝑛$ . A $𝑘$ -permutation of a set of $𝑛$ distinct elements is an ordered selection of $𝑟$ elements arrangement from the set. The number of such permutations is denoted by $𝑃 (𝑛, 𝑘)$

Theorem 3.3.

𝑃 (𝑛, 𝑘) = \prod_{𝑖 = 𝑛 - 𝑘 + 1}^{𝑛} 𝑖 = \frac{𝑛!}{(𝑛 - 𝑘)!}

Definition 3.4 (Circular permutation). A circular $𝑘$ -permutation of a set is an ordered selection of $𝑟$ elements from $𝑆$ arranged as a circle.

Proposition 3.1. The number of circular $𝑘$ -permutations of an $𝑛$ -set equals $\frac{𝑃 (𝑛, 𝑘)}{𝑘}$ .

3.3.1 ○permutation of multisets

Generalized to multisets.

Definition 3.5. Let $𝑀$ be a multiset of type $(𝑛_{1}, \dots, 𝑛_{𝑘})$ ( $𝑘$ distinct objects; $𝑛_{𝑖}$ is the count of the $𝑖$ -th one). An $𝑟$ -permutation of the multiset $𝑀$ is an ordered arrangement of $𝑟$ objects from $𝑀$ , where the order among identical objects is disregarded.

Theorem 3.4. The number of $𝑛$ -permutation of a multiset of type $(𝑛_{1}, \dots, 𝑛_{𝑘})$ , where $𝑛_{1} + \dots + 𝑛_{𝑘} = 𝑛$ , is

\frac{𝑛!}{𝑛_{1}! \dots 𝑛_{𝑘}!} .

Proof. Direct proof: Divide the number of permutations within each type of objects.

Alternatively, we can view this problem as distributing $𝑛$ distinct positions to each type of objects, which becomes this combination problem 3.7. The number of ways is $(\binom{𝑛}{𝑛_{1}, \dots, 𝑛_{𝑘}})$ (See below). □

3.3.2 generating permutations

Method 3.1. We assign a direction to each integer $𝑘$ : $\overset{\leftarrow}{𝑘}$ or $\vec{𝑘}$ . A permutation where each integer is given a direction is called a directed permutation. An integer $𝑘$ in a directed permutation is called mobile if its arrow points to a smaller integer adjacent to it.

Now we present an algorithm to generate permutations of ${1, \dots, 𝑛}$ using directed permutations:

1.: Start with $\overset{\leftarrow}{1} \overset{\leftarrow}{2} \dots \overset{\leftarrow}{𝑛}$ .
2.: Find the largest mobile integer $𝑚$ .
3.: Swap $𝑚$ and the adjacent integer pointed to by the direction of $𝑚$ .
4.: Switch the directions for all integers $𝑝 > 𝑚$ .
5.: Write down the resulting permutation and return to step 2.
6.: Stop if there is no mobile integer.

E.g., $\overset{\leftarrow}{1} \overset{\leftarrow}{2} \overset{\leftarrow}{3}$ , $\overset{\leftarrow}{1} \overset{\leftarrow}{3} \overset{\leftarrow}{2}$ , $\overset{\leftarrow}{3} \overset{\leftarrow}{1} \overset{\leftarrow}{2}$ , $\vec{3} \overset{\leftarrow}{2} \overset{\leftarrow}{1}$ , $\overset{\leftarrow}{2} \vec{3} \overset{\leftarrow}{1}$ , $\overset{\leftarrow}{2} \overset{\leftarrow}{1} \vec{3}$ .

Proof: TODO

3.4 Combination

Definition 3.6 (Combination). A $𝑘$ -combination of a set of $𝑛$ distinct elements is unordered selection of $𝑘 \leq 𝑛$ elements from the set.

The number of $𝑘$ -combination of a set of $𝑛$ distinct elements is denoted by $(\binom{𝑛}{𝑘})$ (\binom{n}{k}).

Theorem 3.5.

(\binom{𝑛}{𝑘}) = \frac{𝑛!}{(𝑛 - 𝑘)! 𝑘!} = \frac{𝑛 (𝑛 - 1) \dots (𝑛 - 𝑘 + 1)}{𝑘!}

Proof. $𝑘$ -permutation can be obtained by ordering a $𝑘$ -combination. Consequently,

𝑃 (𝑛, 𝑘) = (\binom{𝑛}{𝑘}) 𝑃 (𝑘, 𝑘) .

From $𝑃 (𝑛, 𝑘)$ and $𝑃 (𝑘, 𝑘)$ , we can derive $(\binom{𝑛}{𝑘})$ . □

Corollary 3.2.

(\binom{𝑛}{𝑘}) = (\binom{𝑛}{𝑛 - 𝑘})

Proof: Combinatorially by bijection principle.

Bijection Principle:

Two sets $𝑈$ and $𝑉$ have the same size iff there is a bijection function $𝑓$ from $𝑈$ to $𝑉$ .

_______________________________________________________________________________________

Proposition 3.2. $(\binom{2 𝑛}{𝑛}) = \sum_{𝑘 = 0}^{𝑛} (\binom{𝑛}{𝑘}) (\binom{𝑛}{𝑛 - 𝑘}) = \sum_{𝑘 = 0}^{𝑛} {(\binom{𝑛}{𝑘})}^{2}$ .

_______________________________________________________________________________________

3.4.1 distinguishable boxes

This is another way to solve “Permutation with multisets”.

Definition 3.7. An ordered collection of $𝑘$ disjoint subsets $𝐴_{1}, \dots, 𝐴_{𝑘}$ of an $𝑛$ -set (a partition with order) is called a combination of $𝑛$ objects of type $(𝑛_{1}, \dots, 𝑛_{𝑘})$ if $𝑛_{1} = | 𝐴_{1} |, \dots, 𝑛_{𝑘} = | 𝐴_{𝑘} |$ and $𝑛 = 𝑛_{1} + \dots + 𝑛_{𝑘}$ . The number of such combinations is denoted by $(\binom{𝑛}{𝑛_{1}, \dots, 𝑛_{𝑘}})$ .

In other words, this is the number of ways to distribute $𝑛$ distinguishable objects to $𝑘$ distinguishable boxes so that the sizes are $(𝑛_{1}, \dots, 𝑛_{𝑘})$ .

Theorem 3.6.

(\binom{𝑛}{𝑛_{1}, \dots, 𝑛_{𝑘}}) = \frac{𝑛!}{𝑛_{1}! \dots 𝑛_{𝑘}!} .

Proof. Similar. Think of the segment $(\sum_{𝑗 = 1}^{𝑖 - 1} 𝑛_{𝑗}, \sum_{𝑗 = 1}^{𝑖} 𝑛_{𝑗}]$ as the objects distributed to the $𝑖$ -th box. The order within this segment (box) does not matter, so we divide the number of interior permutations. We can map such a partition uniquely to a combination of type $𝑛_{1}, \dots, 𝑛_{𝑘})$ , and vice versa.

Alternatively, it is equal to

(\binom{𝑛}{𝑛_{1}}) (\binom{𝑛 - 𝑛_{1}}{𝑛_{2}}) \dots (\binom{𝑛 - 𝑛_{1} - 𝑛_{2} - \dots 𝑛_{𝑘}}{𝑛_{𝑘}}) = \frac{𝑛!}{𝑛_{1}! \cdot \dots \cdot 𝑛_{𝑘}!} .

□

3.4.2 ○combination with repetition / combinations of multisets

Definition 3.8. An $𝑟$ -combination with repetition of a set $𝑆$ of $𝑛$ distinguishable objects is a selection of $𝑟$ elements where each object can be selected unlimited times. The number of $𝑟$ -combinations with repetition is denoted by $⟨ \binom{𝑛}{𝑟} ⟩$ .

Let $𝑀 = {∞𝑥 | 𝑥 \in 𝑆}$ be a multiset comprising infinitely many copies of objects in $𝑆$ . Then the combination can also be viewed as an $𝑟$ -combination of the multiset $𝑀$ (without repetition, in usual sense).

E.g., objects: ${1, 2, 3}$ , a valid selection of $4$ elements: ${1, 1, 3, 3}$ .

Proposition 3.3. $⟨ \binom{𝑛}{𝑟} ⟩$ is the number of ways to distribute $𝑟$ indistinguishable balls to $𝑛$ distinguishable boxes.

Interpretation: Every time choose one box to put in an object, with repetition allowed. Repeat $𝑟$ times.

Theorem 3.7.

⟨ \binom{𝑛}{𝑟} ⟩ = (\binom{𝑟 + 𝑛 - 1}{𝑟}) = (\binom{𝑟 + 𝑛 - 1}{𝑛 - 1}) .

Proof. Stars and Bars: This problem can be represented as a sequence of $𝑛 - 1$ bars and $𝑟$ stars. The $𝑛 - 1$ bars (together with head and tail of the sequence) mark $𝑛$ cells, which correspond to $𝑛$ objects respectively. The $𝑟$ stars correspond to $𝑟$ selections, and the number of stars in $𝑖$ -th cell represents the times the $𝑖$ -th object was selected. Note that there is a bijection from above sequences to $𝑟$ -combinations, then the answer is the number of ways to choose $𝑛 - 1$ OR $𝑟$ positions out of $𝑛 - 1 + 𝑟$ . □

Corollary 3.3. $⟨ \binom{𝑛}{𝑟} ⟩$ is the number of solutions for the following equation of non-negative variables: $𝑥_{1} + \dots + 𝑥_{𝑛} = 𝑟$ .

Interpretation: Every time choose one variable (initially 0) to add $1$ , with repetition allowed. Repeat $𝑟$ times.

Example 3.5.

Exercise

Find the number of ways to select $𝑚$ distinct numbers from $[𝑛]$ so that no two numbers differ by 1.

Solution 1

Use the difference between the selected elements to rewrite the condition. Let $𝑎_{1}, \dots, 𝑎_{𝑚}$ be the selected elements sorted in ascending order. Let $Δ_{𝑖} = 𝑎_{𝑖 + 1} - 𝑎_{𝑖}$ for $𝑖 = 1, \dots, 𝑚 - 1$ , $Δ_{0} = 𝑎_{1} - 0$ , and $Δ_{𝑚} = 𝑛 + 1 - 𝑎_{𝑚}$ . Then the conditions are equivalent to:

{\sum︁}_{𝑖 = 0}^{𝑚} Δ_{𝑖} = 𝑛 + 1 - 0, Δ_{𝑖} \geq 2 \forall 𝑖 \in [𝑚 - 1], Δ_{0}, Δ_{𝑚} \geq 1 .

The number of the solutions can be derived using the corollary above, $⟨ \binom{𝑛 + 1 - 2 𝑚}{𝑚} ⟩$ .

Solution 2

We add a number $𝑛 + 1$ and combine $𝑖 + 1$ with $𝑖$ for $𝑖 \in [𝑛]$ . Then, the condition is equivalent to choosing $𝑚$ (combined) items such that no two overlap. There are $(𝑛 + 1 - 2 𝑚)$ empty spaces left, so the number of ways equals to the number of arrangements of $𝑚$ (identical) dominos and $(𝑛 + 1 - 2 𝑚)$ (identical) empty spaces.

Example 3.6.

Exercise: Find the number of solutions to the system: $𝑥_{1} + \dots + 𝑥_{𝑛} \leq 𝑚, 𝑥_{𝑖} \geq 0 \forall 𝑖 \in [𝑛]$ .
Solution: Let $𝑥_{𝑛 + 1} = 𝑚 - (𝑥_{1} + \dots + 𝑥_{𝑚})$ . Then, the condition is equivalent to $𝑥_{𝑛 + 1} \geq 0$ . Thus, the system becomes $𝑥_{1} + \dots + 𝑥_{𝑛 + 1} = 𝑚, 𝑥_{𝑖} \geq 0 \forall 𝑖 \in [𝑛 + 1]$ . Note that the solutions to the two systems have one-to-one correspondence. The answer is $⟨ \binom{𝑛 + 1}{𝑚} ⟩$ .

3.4.3 ○generating combinations

That is, to generate the power set of a set.

Definition 3.9 (unit cube). The unit $𝑛$ -cube $𝑄_{𝑛}$ is a graph whose vertex set is the set of 0-1 sequences of length $𝑛$ , where two sequences are adjacent iff they differ in only one place.

Definition 3.10 (Gray code). A Gray code of order $𝑛$ is a path of $𝑄_{𝑛}$ that visits every vertex of $𝑄_{𝑛}$ exactly once, i.e., a Hamilton path of $𝑄_{𝑛}$ .

A Gray code is called cyclic if it is a closed path (the first vertex is visited twice), i.e., a Hamilton cycle of $𝑄_{𝑛}$ .

We want to generate the power set as a (special) Gray code.

Definition 3.11 (reflected Gray code). The Gray codes obtained in the following way are called reflected Gray codes. It’s based on induction:

Base: For $𝑛 = 1$ , $0 \to 1$ .

Inductive step: For $𝑛 \geq 2$ , let $𝑠_{1} \to \dots \to 𝑠_{2^{𝑛 - 1}}$ be the reflected Gray code of order $𝑛 - 1$ . Then, the one of order $𝑛$ is constructed as:

0 𝑠_{1} \to 0 𝑠_{2} \to \dots \to 0 𝑠_{2^{𝑛 - 1}} \to 1 𝑠_{2^{𝑛 - 1}} \to 1 𝑠_{2^{𝑛 - 1} - 1} \to \dots \to 1 𝑠_{1} .

It is easy to see that these are valid Gray codes.

Derivation 3.1. Note that, starting with $0 \dots 0$ , the sums of bits of the sequences in the reflected Gray code alternate in parity.

Based on the inductive construction above, we observe that:

Whenever the sume changes from even to odd, it corresponds to the very beginning $0 \to 1$ or the reflected version $1 \to 0$ , with some prefix.
Whenever the sum changes from odd to even, it corresponds to the joint of some reflection. Ignoring the prefix added later, at the joint, a 0 is added at the beginning of the previous sequence and a 1 for the next sequence, or the opposite occurs if the reflection is done an odd number of times.
Also, by the trait of reflection, the last sequence always differs with the first sequence in only the highest bit, i.e., it is $10 \dots 0$ (there can be no 0). Thus, one side of the joint must be $* 010 \dots 0$ and the other side $* 110 \dots 0$ . Therefore, we can find the rightmost index with 1, and flip the bit on the left to it.

Method 3.2. The following algorithm generates the reflected Gray code of order $𝑛$ in order:

1.: Begin with $𝑎_{𝑛} 𝑎_{𝑛 - 1} \dots 𝑎_{1} = 00 \dots 0$ .
2.: If $𝑎_{𝑛} 𝑎_{𝑛 - 1} \dots 𝑎_{1} = 10 \dots 0$ , stop.
3.: If $\sum_{𝑖 = 1}^{𝑛} 𝑎_{𝑖}$ is even, then flip $𝑎_{1}$ .
4.: If $\sum_{𝑖 = 1}^{𝑛} 𝑎_{𝑖}$ is odd, then find $𝑗 = {argmin}_{𝑖} {𝑎_{𝑖} = 1}$ and flip $𝑎_{𝑗 + 1}$ .
5.: Write down $𝑎$ and return to Step 2.

3.5 Binomial theorem and multinomial theorem

Theorem 3.8 (Binomial theorem and multinomial theorem).

(𝑥 + 𝑦)^{𝑛} = {∑︁}_{𝑖 = 0}^{𝑛} (\binom{𝑛}{𝑖}) 𝑥^{𝑖} 𝑦^{𝑛 - 𝑖},

(𝑥_{1} + \dots + 𝑥_{𝑘})^{𝑛} = \underset{\begin{array}{c} 𝑖_{1} + \dots + 𝑖_{𝑘} = 𝑛 \\ 𝑖_{1}, \dots, 𝑖_{𝑘} \geq 0 \end{array}}{∑︁} (\binom{𝑛}{𝑖_{1}, \dots, 𝑖_{𝑘}}) 𝑥_{1}^{𝑖_{1}} \dots 𝑥_{𝑘}^{𝑖_{𝑘}} .

Example 3.7. Let $𝑛$ be a positive integer. Then

\begin{array}{l} {\sum︁}_{𝑘 = 0}^{𝑛} (- 1)^{𝑘} {(\binom{𝑛}{𝑘})}^{2} = {\begin{matrix} 0, & if 𝑛 is odd \\ (- 1)^{𝑛 / 2} (\binom{𝑛}{𝑛 / 2}), & if 𝑛 is even \end{matrix} \end{array}

3.5.1 Generalized

Definition 3.12 (Binomial coefficients). For any real number $𝛼$ and integer $𝑘$ , define the binomial coefficients

(\binom{𝛼}{𝑘}) = {\begin{matrix} 0 & if 𝑘 < 0 \\ 1 & if 𝑘 = 0 \\ 𝛼 (𝛼 - 1) \dots (𝛼 - 𝑘 + 1) / 𝑘! & if 𝑘 > 0 \end{matrix} .

Proposition 3.4 (Reciprocity law of binomial coefficients).

(\binom{- 𝛼}{𝑘}) = (- 1)^{𝑘} (\binom{𝛼 + 𝑘 - 1}{𝑘}) .

Corollary 3.4. For any positive integer $𝑛$ ,

(\binom{- 𝑛}{𝑘}) = (- 1)^{𝑘} ⟨ \binom{𝑛}{𝑘} ⟩ .

Theorem 3.9 (Newton’s binomial expansion). Let $𝛼$ be a real number. If $| 𝑧 | < 1$ , then

(1 + 𝑧)^{𝛼} = {∑︁}_{𝑘 = 0}^{\infty} (\binom{𝛼}{𝑘}) 𝑧^{𝑘} .

Consequently, we have, for $| 𝑥 | < | 𝑦 |$ :

(𝑥 + 𝑦)^{𝛼} = {∑︁}_{𝑘 = 0}^{\infty} (\binom{𝛼}{𝑘}) 𝑥^{𝑘} 𝑦^{𝛼 - 𝑘} .

Proof. Apply the Taylor expansion for $(1 + 𝑧)^{𝛼}$ around $𝑧 = 0$ :

\begin{array}{l} (1 + 𝑧)^{𝛼} & = {∑︁}_{𝑘 = 0}^{\infty} (𝑧 - 0)^{𝑘} 𝑓^{𝑘} (0) \\ = {\sum︁}_{𝑘 = 0}^{\infty} 𝑧^{𝑘} 𝛼 (𝛼 - 1) \dots (𝛼 - 𝑘 + 1) / 𝑘! . \end{array}

TODO: It can be shown that the R.H.S. has a radius of convergence $1$ . Thus, the formula holds for $| 𝑧 | < 1$ .

Let $𝑧 : = 𝑥 / 𝑦$ . Then we can derive the second formula. □

3.5.2 Unimodality

Definition 3.13 (Unimodality). A sequence $𝑠_{0}, 𝑠_{1}, \dots, 𝑠_{𝑛}$ is said to be unimodal if there is an integer $𝑘$ ( $0 \leq 𝑘 \leq 𝑛$ ) such that $𝑠_{0} \leq 𝑠_{1} \leq \dots \leq 𝑠_{𝑘} \geq 𝑠_{𝑘 + 1} \geq \dots \geq 𝑠_{𝑛}$ .

E.g., the sequence of binomial coefficients (integer) is unimodal.

Definition 3.14 (log-concave). A sequence $𝑠_{0}, 𝑠_{1}, \dots, 𝑠_{𝑛}$ of positive numbers is said to be log-concave if $𝑠_{𝑖}^{2} \geq 𝑠_{𝑖 - 1} 𝑠_{𝑖 + 1}$ for $𝑖 = 1, \dots, 𝑛 - 1$ . This condition is equivalent to that the sequence $log 𝑠_{0}, log 𝑠_{1}, …, log 𝑠_{𝑛}$ is concave, i.e., $log 𝑠_{𝑖} \leq (log 𝑠_{𝑖 - 1} + log 𝑠_{𝑖 + 1}) / 2$ .

Proposition 3.5. Each log-concave (positive) sequence $𝑠_{0}, 𝑠_{1}, \dots, 𝑠_{𝑛}$ is unimodal.

Proof. $𝑠_{𝑖}^{2} \geq 𝑠_{𝑖 - 1} 𝑠_{𝑖 + 1}$ is equivalent to $\frac{𝑠_{𝑖}}{𝑠_{𝑖 - 1}} \geq \frac{𝑠_{𝑖 + 1}}{𝑠_{𝑖}}$ . Discuss if $1$ is between two adjacent ratios. □

3.6 Common Counting Methods

Four methods:

1.: Combinatorial reasoning: Show that the LHS and the RHS are different ways in counting the same thing.
2.: Binomial expansion: Equation (3.8).
3.: Direct computation
4.: Induction (on $𝑛$ and $𝑘$ etc.)

3.7 Other Properties of Permutations

3.7.1 ○inversions of permutations

Definition 3.15 (inversion, inversion sequence, total disorder, sign). Let $𝜋$ be a permutation of $[𝑛]$ .

An ordered pair $(𝜋_{𝑖}, 𝜋_{𝑗})$ , is called an inversion of $𝜋$ if $𝑖 < 𝑗$ and $𝜋_{𝑖} > 𝜋_{𝑗}$ .
Let $𝑎_{𝑘} = | {(𝜋_{𝑖}, 𝑘) : (𝜋_{𝑖}, 𝑘) is an inversion of 𝜋} | = | {𝑠_{𝑖} : 𝑖 < 𝜋^{- 1} (𝑘), 𝜋_{𝑖} > 𝑘} |$ . The tuple $(𝑎_{1}, \dots, 𝑎_{𝑛})$ is called the inversion sequence of $𝜋$ .
The total disorder of $𝜋$ is defined to be $inv (𝜋) = 𝑎_{1} + 𝑎_{2} + \dots + 𝑎_{𝑛}$ (total number of inversions).
The sign of $𝜋$ is defined to be $sign (𝜋) = (- 1)^{inv (𝜋)}$

Theorem 3.10. For every (valid) inversion sequence, we can recover the original permutation using Method 3.3 or Method 3.4.

Method 3.3. Let $(𝑎_{1}, \dots, 𝑎_{𝑛})$ be a valid inversion sequence.

1.: Mark down $𝑛$ empty spaces $□□…□$ .
2.: In the order $𝑖 = 1, 2, \dots, 𝑛$ , put $𝑖$ into the $(𝑎_{𝑖} + 1)$ -th empty space from the left.

Method 3.4. Let $(𝑎_{1}, \dots, 𝑎_{𝑛})$ be a valid inversion sequence.

1.: Write down $𝑛$ .
2.: In the order $𝑖 = 𝑛 - 1, 𝑛 - 2, \dots, 1$ , place $𝑖$ after the $𝑎_{𝑖}$ -th existing item from the left.

Proof: easy.

Example 3.8.

Exercise

Define a “move” of a permutation as selecting an element and inserting it somewhere in the permutation. Find an algorithm to compute the minimal number of moves to restore an arbitrary input permutation of $[𝑛]$ back to the natural order.

Solution

Observe that if we move two disjoint subsets $𝑆$ and $𝑉$ of $[𝑛]$ , then the relative order among $𝑆$ is the same as if we only perform the moves for $𝑆$ , and so as $𝑉$ .

Let $𝑓 (𝑎_{1}, \dots, 𝑎_{𝑛})$ be the minimal number of moves for $𝑎_{1}, \dots, 𝑎_{𝑛}$ . It is natural to consider first how to move $𝑛$ to the right place (similar for 1). This can be done by moving $𝑛$ directly or moving others to indirectly restore $𝑛$ (or some combination). Suppose $𝑎_{𝑘} = 𝑛$ .

By the observation, if we ever move $𝑛$ directly, we would better move it to the last position in one go, because no matter where we move $𝑛$ to the relative order of the rest does no change. This option costs $𝑓 (𝑎_{1}, \dots, 𝑎_{𝑘 - 1}, 𝑎_{𝑘 + 1}, \dots, 𝑎_{𝑛 - 1}) + 1$ moves.
Thus, the other option is to move others only. We have to move $𝑎_{𝑘 + 1}, \dots, 𝑎_{𝑛}$ to the left of $𝑎_{𝑘}$ because they are all smaller than $𝑎_{𝑘} = 𝑛$ . Again, by the observation, no matter how we move them, the relative order among $𝑎_{1}, \dots, 𝑎_{𝑘 - 1}$ does not change after the moves. We have to at least:
1.
move $𝑎_{𝑘 + 1}, \dots, 𝑎_{𝑛}$ and restore the order among them, which costs $𝑛 - 𝑘$ moves,
2.
and restore the order among $𝑎_{1}, \dots, 𝑎_{𝑘 - 1}$ , which costs $𝑓 (𝑎_{1}, \dots, 𝑎_{𝑘 - 1})$ moves.

It turns out that these are enough because we can do the 2nd part first followed by the 1st part, in which we restore the order across the two subsets by the way.

Thus, $𝑓 (𝑎_{1}, \dots, 𝑎_{𝑛}) = min (𝑓 (𝑎_{1}, …, 𝑎_{𝑘 - 1}, 𝑎_{𝑘 + 1}, …, 𝑎_{𝑛 - 1}) + 1, 𝑓 (𝑎_{1}, …, 𝑎_{𝑘 - 1}) + 𝑛 - 𝑘$ .

3.7.2 ○cycles of permutations

Definition 3.16 (Cycle). Let $𝑥_{1}, \dots, 𝑥_{𝑘}$ be $𝑘$ distinct elements in $[𝑛]$ . We denote by $(𝑥_{1} 𝑥_{2} \dots 𝑥_{𝑘})$ the cycle $𝑥_{1} \to 𝑥_{2} \to \dots \to 𝑥_{𝑘} \to 𝑥_{1}$ . Additionally, we view this as a function $[𝑛] \to [𝑛]$ mapping elements on the cycle to the next element and mapping other elements to themselves:

(𝑥_{1} 𝑥_{2} \dots 𝑥_{𝑘}) (𝑥) = {\begin{matrix} 𝑥_{𝑖 𝑚𝑜𝑑 𝑘 + 1}, & if \exists 𝑖, 𝑥 = 𝑥_{𝑖} \\ 𝑥, & otherwise \end{matrix} .

Remark: A cycle $(𝑥_{1} 𝑥_{2} \dots 𝑥_{𝑘})$ is invariant to “rotation” or “cyclical permutation”. E.g., $(𝑥_{𝑘} 𝑥_{1} \dots 𝑥_{𝑘 - 1})$ still represents the same cycle.

_______________________________________________________________________________________

Definition 3.17 (Transposition). A transposition is a cycle of length 2. An adjacent transposition is a transposition with adjacent elements, of the form $(𝑖𝑗)$ , where $𝑗 = 𝑖 \pm 1$ .

Derivation 3.2. Let $(𝑖𝑗)$ be a non-adjacent transposition. Since $(𝑖𝑗) = (𝑗𝑖)$ , WLOG, we can assume $𝑗 - 𝑖 \geq 2$ . We try to insert some intermediate element $𝑘 \in (𝑖, 𝑗)$ in between. If we break $(𝑖𝑗)$ into $(𝑗𝑘) (𝑘𝑖)$ , then $𝑖$ can be successfully swapped to position $𝑗$ , but this introduces the position change of $𝑘$ and puts $𝑗$ wrongly on position $𝑖$ . Then, we need to add another $(𝑘𝑖)$ on the left to correct this. Thus, $(𝑖𝑗) = (𝑘𝑖) (𝑗𝑘) (𝑘𝑖)$ .

Generally, one can first use $(𝑗, 𝑗 - 1) (𝑗 - 1, 𝑗 - 2) \dots (𝑖 + 1, 𝑖)$ to swap $𝑖$ to the target position $𝑗$ , with the effect of moving $𝑖 + 1, \dots, 𝑗$ to one position smaller.

Then, one can use $(𝑖 + 1, 𝑖) (𝑖 + 2, 𝑖 + 1) \dots (𝑗 - 1, 𝑗 - 2)$ to move $𝑖 + 1, \dots, 𝑗 - 1$ back, by the way swapping $𝑗$ to $𝑖$ .

In summary,

(𝑖𝑗) = (𝑖 + 1, 𝑖) (𝑖 + 2, 𝑖 + 1) \dots (𝑗 - 1, 𝑗 - 2) (𝑗, 𝑗 - 1) (𝑗 - 1, 𝑗 - 2) \dots (𝑖 + 1, 𝑖)

Theorem 3.11. Any transposition can be represented as a composition of adjacent transpositions.

_______________________________________________________________________________________

Definition 3.18 (Standard Cycle Notation). Let $𝜋$ be a permutation of $[𝑛]$ . We derive the standard cycle notation as follows:

1.: Viewing $𝑥 \mapsto 𝜋 (𝑥)$ as directed edges, find all the cycles.
2.: Standardize each cycle by making the largest element in the first place. E.g., $(238)$ becomes $(823)$ .
3.: Compose all standardized cycles (functions) in ascending order of the leading elements.

The resulting function is called the standard cycle notation of that permutation, denoted by $cyc (𝜋)$ . Obviously, we have $𝜋 (𝑖) = cyc (𝜋) (𝑖)$ .

E.g., $43865712 \mapsto (5) (7146) (823)$ .

Figure 5: standard cycle notation example

Proposition 3.6. We can remove the parentheses in a standard cycle notation and still recover the permutation.

Proof. In each cycle, the leading element should be strictly greater than the following elements. On the other hand, the leading elements are increasing. So we can identify the end of a cycle by finding the first element greater than the leading element. □

Definition 3.19. Let $\hat{cyc} (𝜋)$ , where $𝜋 \in 𝔖_{𝑛}$ , denote the permutation of $[𝑛]$ obtained from $cyc (𝜋)$ by deleting all the parentheses.

Derivation 3.3. Let $rec$ denote the recovering function in the proposition above. $rec$ is by definition an inverse function of $\hat{cyc}$ (for any permutation $𝜋$ , $rec (\hat{cyc} (𝜋)) = 𝜋$ ). Thus, the latter one is a bijection.

Proposition 3.7. $\hat{cyc} : 𝔖_{𝑛} \to 𝔖_{𝑛}$ is bijection.

4 Counting Applications

4.1 ○Partially ordered sets

4.1.1 Sperner’s Theorem

Consider chains and anti-chains of subsets of $𝑆$ .

Theorem 4.1 (Sperner’s Theorem). Every anti-chain of subsets of an $𝑛$ -element set $𝑆$ has size at most $(\binom{𝑛}{⌊ 𝑛 / 2 ⌋})$ and it is achievable.

Proof. Let $𝐴$ be an arbitrary anti-chain of subsets of $𝑆$ and $𝐶$ an arbitrary chain. It is easy to easy that $| 𝐴 \cap 𝐶 |$ is either 0 or 1 from the properties the intersection has to satisfy.

Now consider an arbitrary partition $𝐶$ of the power set $𝑃 (𝑆)$ into disjoint chains $𝐶_{1}, \dots, 𝐶_{𝑚}$ . Then,

\begin{array}{l} 𝐴 = 𝐴 \cap 𝑃 (𝑆) = 𝐴 \cap ⊔_{𝑖 = 1}^{𝑚} 𝐶_{𝑖} = ⊔_{𝑖 = 1}^{𝑚} 𝐴 \cap 𝐶_{𝑖} . \end{array}

It follows that $| 𝐴 | \leq | 𝐶 | = 𝑚$ , which is the number of disjoint chains.

To find the $𝐴$ with the largest size, it suffices to construct an $𝐴$ and a partition such that their sizes are equal (it turned out to be feasible). The construction is done by induction on $𝑛$ :

Base: $𝑛 = 1$ . Only one chain $𝐶_{1} = {\emptyset, {1}}$ .
Induction: Based on the chain partition for $𝑛 - 1$ .
- For every chain, append a new subset at the end which is obtained by adding $𝑛$ to the last subset of the chain. ( ${𝑐_{1}, \dots, 𝑐_{𝑘}} \to {𝑐_{1}, \dots, 𝑐_{𝑘}, 𝑐_{𝑘} \cup {𝑛}}$ )
- For every chain with more than $1$ element, delete the last subset and add $𝑛$ to each subset. ( ${𝑐_{1}, \dots, 𝑐_{𝑘}} \to {𝑐_{1} \cup {𝑛}, \dots, 𝑐_{𝑘} \cup {𝑛}}$ )
Clearly, they are valid chains. Additionally, all possible subsets containing $𝑛$ have been covered, so it is also a valid chain partition. P.S., this partition is a symmetric chain partion because it has the following two properties:
- It is saturated: No subset can be added in between any two consecutive subsets.
- It is symmetric: The size of the first subset plus the size of the last subset equals $𝑛$ .

On the other hand, if we pick the subset with size $⌈ 𝑛 / 2 ⌉$ in each chain (can be shown that it exists and is unique), they form an antichain (can be shown using the size). The length of this antichain is equal to the size of the chain partition, which is equal to $(\binom{𝑛}{⌈ 𝑛 / 2 ⌉})$ because all subsets of size $⌈ 𝑛 / 2 ⌉$ are included here. □

4.1.2 ○Dilworth Theorem

Lemma 4.1. Let $(𝑋, ≼)$ be a poset. For any chain $𝐶$ and anti-chain $𝐴$ of $𝑋$ ,

| 𝐴 \cap 𝐶 | \in {0, 1} .

Lemma 4.2. Let $(𝑋, ≼)$ be a finite poset (as we are counting the elements). For any chain $𝐶$ of $𝑋$ and anti-chain partition $𝐴$ of $𝑋$ ,

| 𝐶 | \leq | 𝐴 | .

Likewise, for any anti-chain $𝐴$ of $𝑋$ and chain-partition $𝐶$ of $𝑋$ ,

| 𝐴 | \leq | 𝐶 | .

Proof. $| 𝐶 | = | 𝐶 \cap 𝑋 | = | 𝐶 \cap (𝑈_{𝐴 \in 𝐴} 𝐴) | = | \cup_{𝐴 \in 𝐴} 𝐶 \cap 𝐴 | = \sum_{𝐴 \in 𝐶} | 𝐶 \cap 𝐴 | \leq \sum_{𝐴 \in 𝐴} = | 𝐴 |$ . Similar for the other one. □

The following theorems state that the equality can be achieved.

Theorem 4.2. Let $(𝑋, ≼)$ be a finite poset. Then,

max {| 𝐶 | : 𝐶 chains} = min {| 𝐴 | : 𝐴 anti-chain partitions} .

Proof. [TODO: ] □

Theorem 4.3 (Dilworth theorem). Let $(𝑋, ≼)$ be a finite poset. Then,

max {| 𝐴 | : 𝐴 chains} = min {| 𝐶 | : 𝐶 chain partitions} .

Proof. [TODO: ] □

4.2 Graph

Definition 4.1 (coloring). A coloring of a graph with $𝑡$ colors is a function $: 𝑉 \to [𝑡]$ .

Definition 4.2. A coloring is called proper if no two adjacent nodes receive the same color. The number of proper colorings is a polynomial function of $𝑡$ , called the chromatic polynomial of $𝐺$ , denoted by $𝜒 (𝐺, 𝑡)$ . Thus, it can be written in the form

𝜒 (𝐺, 𝑡) = {\sum︁}_{𝑘 = 0}^{𝑛} (- 1)^{𝑛 - 𝑘} 𝑎_{𝑘} 𝑡^{𝑘} .

4.3 Ramsey Theory

4.3.1 definitions

We denote a complete graph of $𝑛$ vertices by $𝐾_{𝑛}$ .

Theorem 4.4 (Ramsey Number). Given $𝑟, 𝑏 \geq 1$ . A (edge-)coloring of $𝐾_{𝑛}$ by colors $1, 2$ is said to satisfy the Ramsey property of type $(𝑟, 𝑏)$ if there exists

a subgraph $𝐾_{𝑟}$ of color $1$ , OR
a subgraph $𝐾_{𝑏}$ of color $2$ .

Ramsey theorem states that there exists a smallest integer, $𝑛 = 𝑅 (𝑟, 𝑏)$ , called the Ramsey number, such that, any coloring of $𝐾_{𝑛}$ satisfies the Ramsey property. This condition on $𝑛$ is sometines denoted as $𝐾_{𝑛} \to 𝐾_{𝑟}, 𝐾_{𝑏}$ .

No proof.

_______________________________________________________________________________________

Theorem 4.5 (Ramsey Theorem, more colors). Given $𝑞_{1}, \dots, 𝑞_{𝑘} \geq 1$ . A $𝑘$ -(edge-)coloring of $𝐾_{𝑛}$ by colors $𝑐_{1}, \dots, 𝑐_{𝑘}$ is said to satisfy the Ramsey property of type $(𝑞_{1}, \dots, 𝑞_{𝑘})$ if there exists

a subgraph $𝐾_{𝑞_{1}}$ of color $𝑐_{1}$ , OR
a subgraph $𝐾_{𝑞_{2}}$ of color $𝑐_{2}$ , OR
…, OR
a subgraph $𝐾_{𝑞_{𝑘}}$ of color $𝑞_{𝑘}$ .

Ramsey theorem states that there exists a smallest integer, $𝑛 = 𝑅 (𝑞_{1}, \dots, 𝑞_{𝑘})$ , called the Ramsey number, such that, any $𝑘$ -coloring of $𝐾_{𝑛}$ satisfies the Ramsey property. This condition on $𝑛$ is sometines denoted as $𝐾_{𝑛} \to 𝐾_{𝑞_{1}}, \dots 𝐾_{𝑞_{𝑘}}$ .

No proof

_______________________________________________________________________________________

We consider $𝑑$ -uniform hypergraph, in which edges are $𝑑$ -element subsets of the vertex set (i.e., undirectional). A $𝑑$ -uniform complete graph with vertex set $𝑉$ has the collection of all $𝑑$ -element subsets of $𝑉$ as its edge set. It’s called a complete $𝑑$ -family of size $| 𝑉 |$ , denoted by $𝑃_{𝑑} (𝑉) : = {𝑆 \subset 𝑉 | | 𝑆 | = 𝑑}$ . The graph itself is denoted by $𝐾_{| 𝑉 |}^{𝑑}$ .

Theorem 4.6 (Ramsey Theorem, hypergraph). Given $𝑞_{1}, \dots, 𝑞_{𝑘} \geq 𝑑 \geq 1$ . A $𝑘$ -(edge-)coloring of $𝐾_{𝑛}^{𝑑}$ by colors $𝑐_{1}, \dots, 𝑐_{𝑘}$ is said to satisfy the Ramsey property of type $(𝑞_{1}, \dots, 𝑞_{𝑘})$ if there exists

a subgraph $𝐾_{𝑞_{1}}^{𝑑}$ of color $𝑐_{1}$ , OR
a subgraph $𝐾_{𝑞_{2}}^{𝑑}$ of color $𝑐_{2}$ , OR
…, OR
a subgraph $𝐾_{𝑞_{𝑘}}^{𝑑}$ of color $𝑞_{𝑘}$ .

Ramsey theorem states that there exists a smallest integer, $𝑛 = 𝑅_{𝑑} (𝑞_{1}, \dots, 𝑞_{𝑘})$ , called the Ramsey number, such that, any $𝑘$ -coloring of $𝐾_{𝑛}^{𝑑}$ satisfies the Ramsey property. This condition on $𝑛$ is sometines denoted as $𝐾_{𝑛}^{𝑑} \to 𝐾_{𝑞_{1}}^{𝑑}, \dots 𝐾_{𝑞_{𝑘}}^{𝑑}$ .

Proof: See a partial proof below.

4.3.2 properties

Proposition 4.1. $𝑅 (3, 3) = 6$ .

Proof. Pigeon principle. □

Proposition 4.2.

$𝑅 (𝑞, 2) = 𝑅 (𝑞) = 𝑞$ for $𝑞 \geq 1$ .
$𝑅 (𝑞_{1}, \dots, 𝑞_{𝑘}, 2) = 𝑅 (𝑞_{1}, \dots, 𝑞_{𝑘})$ for $𝑞_{1}, \dots, 𝑞_{𝑘} \geq 1$ .
$𝑅_{𝑡} (𝑞_{1}, \dots, 𝑞_{𝑘}, 𝑡) = 𝑅_{𝑡} (𝑞_{1}, \dots, 𝑞_{𝑘})$ for $𝑞_{1}, \dots, 𝑞_{𝑘} \geq 1$ .

Proof.

If $𝑞 < 2$ , it’s easy to verify. Next, we consider $𝑞 \geq 2$ .
We have to simultaneously satisfies the following two cases of coloring:
1.
If at least one edge is colored by color 2, the property is satisfied.
2.
Otherwise, if all edges are colored by color 1, there has to be a subgraph $𝐾_{𝑞}$ (it’s in color 1 automatically).

Since the Ramsey number is clearly $\geq 2$ , which always satisfies case 1, the condition becomes exactly equivalent to the condition for case 2. That is, $𝑅 (𝑞, 2) = 𝑅 (𝑞)$ .
In summary, we have $𝑅 (𝑞, 2) = 𝑅 (𝑞)$ .
Similar. Generalize “case 2” in the previous proof.
Similar. Generalize the notion of “edges” in “case 1”.
Trivial when $𝑞_{1}, \dots, 𝑞_{𝑘} < 𝑡$ . Now, we only consider $𝑞_{1}, \dots, 𝑞_{𝑘} \geq 𝑡$ .
We have to satisfy the following two cases of coloring:
1.
If at least one edge (i.e., a $𝑑$ -subset) is colored by color $(𝑘 + 1)$ , then the property is already satisfied.
2.
Otherwise, no edges are in color $(𝑘 + 1)$ . The condition becomes the one for $(𝑞_{1}, \dots, 𝑞_{𝑘})$ .

Since the Ramsey number has to be at least $𝑡$ , which satisfies case 1, the condition is the same as the one for case 2. Thus, $𝑅 (𝑞_{1}, \dots, 𝑞_{𝑘}, 𝑡) = 𝑅 (𝑞_{1}, \dots, 𝑞_{𝑘})$ .
This proves the proposition.

□

_______________________________________________________________________________________

Proposition 4.3. $𝑅 (𝑝, 𝑞) \leq 𝑅 (𝑝 - 1, 𝑞) + 𝑅 (𝑞, 𝑝 - 1) .$

Proof. It can be easily proved using Lemma 4.3. Here, we give a proof purely based on the definition.

Let $𝑛 = 𝑅 (𝑝 - 1, 𝑞) + 𝑅 (𝑞, 𝑝 - 1)$ . We want to show that any coloring of $𝐾_{𝑛}$ satisfies the property. Fix a coloring. Consider choosing an arbitrary vertex $𝑣$ . Let $𝑉_{1} = {𝑤 \in 𝑉 | 𝑢𝑤 is in color 1}$ and define $𝑉_{2}$ analogously. By Pigeonhole Principle, either $| 𝑉_{1} | \geq 𝑅 (𝑝 - 1, 𝑞)$ or $| 𝑉_{2} | \geq 𝑅 (𝑝, 𝑞 - 1)$ . WLOG, we assume the first case. Therefore,

If there exists a color-1 $𝐾_{𝑝 - 1}$ in $𝑉_{1}$ , then by adding $𝑣$ to $𝑉_{1}$ we get a color-1 $𝐾_{𝑝}$ .
Otherwise, there exists a color-2 $𝐾_{𝑞}$ in $𝑉_{1}$ .

Thus, $𝑛$ satisfies the property. □

Corollary 4.1. $𝑅 (𝑝, 𝑞) \leq (\binom{𝑝 + 𝑞 - 2}{𝑝 - 1})$ .

Proof. Induction on the sum of $𝑝$ and $𝑞$ . (Assume it is true for $𝑝 + 𝑞 < 𝑛$ , …) □

Proposition 4.4. Generalizing the proposition above, we obtain:

𝑅 (𝑞_{1}, \dots, 𝑞_{𝑘}) - 1 \leq ({\sum︁}_{𝑖 = 1}^{𝑘} 𝑅 (𝑞_{1}, \dots, 𝑞_{𝑖 - 1}, 𝑞_{𝑖} - 1, 𝑞_{𝑖 + 1}, \dots, 𝑞_{𝑘})) - 𝑘 + 1 .

E.g., $𝑅 (3, 3, 3) \leq 17$ .

_______________________________________________________________________________________

Proposition 4.5. The $𝑡 = 1$ case is just Thm 3.2 Pigeonhole principle, strong form:

𝑅_{1} (𝑞_{1}, \dots, 𝑞_{𝑘}) = 𝑞_{1} + \dots + 𝑞_{𝑘} - 𝑘 + 1 .

Lemma 4.3.

𝑅_{𝑡} (𝑝, 𝑞) = 𝑅_{𝑡 - 1} [𝑅_{𝑡} (𝑝 - 1, 𝑞), 𝑅_{𝑡} (𝑝, 𝑞 - 1)] + 1 .

Proof: TODO

Proposition 4.6.

𝑅_{𝑡} (𝑞_{1}, \dots, 𝑞_{𝑘}) \leq 𝑅_{𝑡} [𝑞_{1}, \dots, 𝑞_{𝑘 - 2}, 𝑅_{𝑡} (𝑞_{𝑘 - 1}, 𝑞_{𝑘})] .

Proof. Consider $𝑛 = 𝑅_{𝑡} [𝑞_{1}, \dots, 𝑞_{𝑘 - 2}, 𝑅_{𝑡} (𝑞_{𝑘 - 1}, 𝑞_{𝑘})]$ . Any □

4.3.3 TODO proof of the theorem

5 Inclusion-exclusion principle

Theorem 5.1. Let $𝑆$ be a finite set. Let $𝑃_{1}, \dots, 𝑃_{𝑛}$ be properties referring to the objects of $𝑆$ . Let $𝐴_{𝑖}$ denote the set of elements satisfying $𝑃_{𝑖}$ in $𝑆$ . The number of elements in $𝑆$ satisfying none of the properties $𝑃_{1}, \dots, 𝑃_{𝑛}$ is given by

| \bar{𝐴_{1}} \cap \dots \cap \bar{𝐴_{𝑛}} | = | 𝑆 | - \underset{𝑖}{\sum︁} | 𝐴_{𝑖} | + \underset{𝑖 < 𝑗}{\sum︁} | 𝐴_{𝑖} \cap 𝐴_{𝑗} | + \dots + (- 1)^{𝑛} | 𝐴_{1} \cap \dots \cap 𝐴_{𝑛} | .

Proof. □

5.1 weighted version

Let $𝑆$ be a set (either finite or infinite). Let $R^{𝑆}$ denote the set of functions from $𝑆$ to $R$ . $R^{𝑆}$ is a vector space (with usual addition and scalar multiplication). Moreover, $R^{𝑆}$ is an algebra with usual product and identity $1_{𝑆}$ .

Let $𝑤 : 𝑆 \to R$ be a weight function which is nonzero at only finitely many points.

The weight of a subset $𝐴 \subset 𝑆$ is defined as $𝑤 (𝐴) = \sum_{𝑥 \in 𝐴} 𝑤 (𝑥)$ .
The weight of a function $𝑓 : 𝑆 \to R$ is defined as $𝑤 (𝑓) = \sum_{𝑥 \in 𝑆} (𝑤 \cdot 𝑓) (𝑥)$ (finite sum over the support).
We can verify that, for functions $𝑓_{𝑖}$ and constants $𝑐_{𝑖}$ ( $𝑖 \in [𝑚]$ ), $𝑤 (\sum_{𝑖 = 1}^{𝑚} 𝑐_{𝑖} 𝑓_{𝑖}) = \sum_{𝑖 = 1}^{𝑚} 𝑐_{𝑖} 𝑤 (𝑓_{𝑖})$ . Thus, $𝑤$ is a linear function from $R^{𝑆}$ to $R$ , i.e., a linear functional on $R^{𝑆}$ .

Theorem 5.2. Let $𝑆$ be a finite or infinite set. Let $𝑃_{1}, \dots, 𝑃_{𝑛}$ be properties referring to the objects of $𝑆$ . Let $𝐴_{𝑖}$ denote the set of elements satisfying $𝑃_{𝑖}$ in $𝑆$ . Then,

1_{\bar{𝐴_{1}} \cap \dots \cap \bar{𝐴_{𝑛}}} = 1_{𝑆} + {\sum︁}_{𝑘 = 1}^{𝑛} (- 1)^{𝑘} \underset{𝑖_{1} < \dots < 𝑖_{𝑘}}{∑︁} 1_{𝐴_{𝑖_{1}} \cap \dots \cap 𝐴_{𝑖_{𝑘}}} .

Given a weight function $𝑤$ on $𝑆$ , as a corollary, we have

𝑤 (\bar{𝐴_{1}} \cap \dots \cap \bar{𝐴_{𝑛}}) = 𝑤 (𝑆) + {\sum︁}_{𝑘 = 1}^{𝑛} (- 1)^{𝑘} \underset{𝑖_{1} < \dots < 𝑖_{𝑘}}{∑︁} 𝑤 (𝐴_{𝑖_{1}} \cap \dots \cap 𝐴_{𝑖_{𝑘}}) .

Proof. [TODO: ] □

Let $𝑆$ be a finite set and $𝐴_{1}, \dots, 𝐴_{𝑛}$ be subsets of $𝑆$ . We define two functions $𝛼$ and $𝛽$ on $𝑃 ([𝑛])$ as follows:

\begin{matrix} \begin{aligned} 𝛼 (\emptyset) = 0, 𝛼 (𝐼) = 𝑤 (\cap_{𝑖 \in 𝐼} 𝐴_{𝑖}) if 𝐼 \neq \emptyset \\ 𝛽 (\emptyset) = 0, 𝛽 (𝐼) = 𝑤 (\cup_{𝑖 \in 𝐼} 𝐴_{𝑖}) if 𝐼 \neq \emptyset \end{aligned} . \end{matrix}

Derivation 5.1. Applying the weighted inclusion-exclusion principle ((5.2)) to $𝐴_{𝑗}$ ( $𝑗 \in 𝐽$ ), we have

\begin{array}{l} 𝑤 (\cap_{𝑗 \in 𝐽} \bar{𝐴_{𝑖}}) & = 𝑤 (𝑆) + \underset{𝐼 \subset 𝐽, 𝐼≠∅}{\sum︁} (- 1)^{| 𝐼 |} 𝑤 (\cap_{𝑖 \in 𝐼} 𝐴_{𝑖}) \\ 0 & = 𝑤 (\cup_{𝑗 \in 𝐽} 𝐴_{𝑖}) + \underset{𝐼 \subset 𝐽, 𝐼≠∅}{\sum︁} (- 1)^{| 𝐼 |} 𝑤 (\cap_{𝑖 \in 𝐼} 𝐴_{𝑖}) \end{array}

Applying it to $\bar{𝐴_{𝑗}}$ ( $𝑗 \in 𝐽$ ) (complement of the properties), we have

\begin{array}{l} 𝑤 (\cap_{𝑗 \in 𝐽} 𝐴_{𝑖}) & = 𝑤 (𝑆) + \underset{𝐼 \subset 𝐽, 𝐼≠∅}{\sum︁} (- 1)^{| 𝐼 |} 𝑤 (\cap_{𝑖 \in 𝐼} \bar{𝐴_{𝑖}}) \\ = 𝑤 (𝑆) + \underset{𝐼 \subset 𝐽, 𝐼≠∅}{\sum︁} (- 1)^{| 𝐼 |} 𝑤 (𝑆 - \cup_{𝑖 \in 𝐼} 𝐴_{𝑖}) \\ = \underset{= 0}{\underset{⏟}{{\sum︁}_{𝑘 = 0}^{𝑛} (\binom{𝑛}{𝑘}) (- 1)^{𝑘} 𝑤 (𝑆)}} + \underset{𝐼 \subset 𝐽, 𝐼≠∅}{\sum︁} (- 1)^{| 𝐼 | + 1} 𝑤 (\cup_{𝑖 \in 𝐼} 𝐴_{𝑖}) \end{array}

Adding $𝛼 (\emptyset)$ and $𝛽 (\emptyset)$ to the above results, we get:

Proposition 5.1. For any $𝐽 \subset [𝑛]$ ,

𝛽 (𝐽) = \underset{𝐼 \subset 𝐽}{\sum︁} (- 1)^{| 𝐼 | + 1} 𝛼 (𝐼),

and

𝛼 (𝐽) = \underset{𝐼 \subset 𝐽}{\sum︁} (- 1)^{| 𝐼 | + 1} 𝛽 (𝐼),

5.2 derangements

Definition 5.1. A permutation of $[𝑛]$ is called a derangement if no $𝑖 \in [𝑛]$ is placed at the $𝑖$ -th position. The number of derangements of $[𝑛]$ is denoted by $𝐷_{𝑛}$ .

Proposition 5.2. The derangement sequence $𝐷_{𝑛}$ satisfies the recurrence relation

𝐷_{𝑛} = (𝑛 - 1) (𝐷_{𝑛 - 1} + 𝐷_{𝑛 - 2}), 𝑛 \geq 3

with the initial conditions $𝐷_{1} = 0, 𝐷_{2} = 1$ .

Proof. Consider how to satisfy the derangement requirement for the $𝑛$ -th position. We can put any one from $[𝑛 - 1]$ at the $𝑛$ -th position. Fix $𝑘 \in [𝑛 - 1]$ . There are two cases:

Simply swap $𝑘$ and $𝑛$ . Then the remaining $(𝑛 - 2)$ positions have to satisfy the derangement requirement.
Otherwise, $𝑘$ is put at the $𝑛$ -th position but $𝑛$ is not allowed at the $𝑘$ -th position. Then, the first $(𝑛 - 1)$ positions have to satisfy the derangment requirement, with $𝑛$ viewed as $𝑘$ .

Thus, $𝐷_{𝑛} = (𝑛 - 1) (𝐷_{𝑛 - 2} + 𝐷_{𝑛 - 1})$ . □

Derivation 5.2. Let $𝑃_{𝑖}$ denote the property that a permutation of $[𝑛]$ has the integer $𝑖$ in the $𝑖$ -th position. Let $𝐴_{𝑖}$ denote the set of all permutations satisfying $𝑃_{𝑖}$ . We want to count $\bar{𝐴_{1}} \cap \bar{𝐴_{2}} \cap \dots \cap \bar{𝐴_{𝑛}}$ .

For $1 \leq 𝑖_{1} < \dots < 𝑖_{𝑘} \leq 𝑛$ , $| 𝐴_{𝑖_{1}} \cap \dots \cap 𝐴_{𝑖_{𝑘}} | = (𝑛 - 𝑘)!$ . By the Inclusion-Exclusion principle, we have

\begin{array}{l} 𝐷_{𝑛} & = | \bar{𝐴_{1}} \cap \bar{𝐴_{2}} \cap \dots \cap \bar{𝐴_{𝑛}} | \\ = | 𝑆 | + {\sum︁}_{𝑘 = 1}^{𝑛} (- 1)^{𝑘} \underset{𝑖_{1} < \dots < 𝑖_{𝑘}}{∑︁} | 𝐴_{𝑖_{1}} \cap \dots \cap 𝐴_{𝑖_{𝑘}} | \\ = 𝑛! + {\sum︁}_{𝑘 = 1}^{𝑛} (- 1)^{𝑘} (\binom{𝑛}{𝑘}) (𝑛 - 𝑘)! \\ = 𝑛! {\sum︁}_{𝑘 = 0}^{𝑛} \frac{(- 1)^{𝑘}}{𝑘!} . \end{array}

Theorem 5.3. For $𝑛 \geq 1$ , the number of derangements of $[𝑛]$ is

𝐷_{𝑛} = 𝑛! {\sum︁}_{𝑘 = 0}^{𝑛} \frac{(- 1)^{𝑘}}{𝑘!} \approx \frac{𝑛!}{𝑒} .

5.2.1 [TODO: circular non-consecutive permutations]

5.3 permutation with forbidden positions

Definition 5.2. Let $𝑋_{1}, \dots, 𝑋_{𝑛}$ be subsets of $[𝑛]$ . The set of permutations $𝑎_{1} 𝑎_{2} \dots 𝑎_{𝑛}$ of $[𝑛]$ such that

𝑎_{1} \notin 𝑋_{1}, \dots, 𝑎_{𝑛} \notin 𝑋_{𝑛}

is denoted by $𝑃 (𝑋_{1}, \dots, 𝑋_{𝑛})$ .

Definition 5.3. A placement of $𝑛$ indistinguishable rooks on an $𝑛$ -by- $𝑛$ board is called non-attacking if no two rooks share the same column or row.

For convenience, let us denote by $𝑆$ the set of placements of $𝑛$ non-attacking indistinguishable rooks on an $𝑛 \times 𝑛$ -board. It is clear that there is a one-to-one correspondence between permutations of $[𝑛]$ and placements in $𝑆$ by placing the $𝑖$ -th rook to $(𝑖, 𝑎_{𝑖})$ . A rook placement in $𝑆$ is said to satisfy the property $𝑃_{𝑖}$ if the rook in the $𝑖$ -th row has column index in $𝑋_{𝑖}$ . Let $𝐴_{𝑖}$ denote the set of rook placements satisfying $𝑃_{𝑖}$ . Then,

\begin{array}{l} | 𝑃 (𝑋_{1}, \dots, 𝑋_{𝑛}) | & = | \bar{𝐴_{1}} \cap \dots \cap \bar{𝐴_{𝑛}} | \\ = | 𝑆 | + {\sum︁}_{𝑘 = 1}^{𝑛} (- 1)^{𝑘} \underset{𝑖_{1} < \dots < 𝑖_{𝑘}}{∑︁} | 𝐴_{𝑖_{1}} \cap \dots \cap 𝐴_{𝑖_{𝑘}} | . \end{array}

5.4 [TODO: nonconsecutive permutations]

Definition 5.4. A permutation of $[𝑛]$ is called nonconsecutive if the consecutive words of length two

12, 23,, \dots, (𝑛 - 1) 𝑛

do not occur. We denote by $𝑄_{𝑛}$ the number of nonconsecutive permutations of $[𝑛]$ .

Theorem 5.4. For $𝑛 \geq 1$ ,

𝑄_{𝑛} = {\sum︁}_{𝑘 = 0}^{𝑛 - 1} (- 1)^{𝑘} (\binom{𝑛 - 1}{𝑘}) (𝑛 - 𝑘)! .

Proof. Let $𝑃_{𝑖}$ be the property that $𝑖 (𝑖 + 1)$ occurs in the permutation and $𝐴_{𝑖}$ be the set of permutations with this property.

We have to know $| 𝐴_{𝑖_{1}} \cap \dots \cap 𝐴_{𝑖_{𝑘}}$ for $1 \leq 𝑖_{1} < \dots < 𝑖_{𝑘} \leq 𝑛 - 1$ . Notice that, unlike previous situations, the property $𝑃_{𝑖}$ involves two positions. Assume $𝑖_{1} 𝑖_{2} \dots 𝑖_{𝑘}$ consists of $𝑚$ consecutive segments, with length $𝑛_{1}, \dots, 𝑛_{𝑚}$ , respectively. $𝑛_{1} + \dots + 𝑛_{𝑚} - 𝑚 = 𝑘$ . Then, there are in total $𝑛 - 𝑘$ remaining elements and $𝑚$ groups free to arrange. The number of permutations in this case is $(𝑛 - 𝑘 + (𝑛_{1} + \dots + 𝑛_{𝑚} - 𝑚))! = (𝑛 - 𝑘)!$ , which happens to depend only on $𝑘$ . [TODO: interpretation]

Then,

\begin{array}{l} 𝑄_{𝑛} & = | 𝑆 | + {\sum︁}_{𝑘 = 0}^{𝑛 - 1} (- 1)^{𝑘} {∑︁}_{𝑚 = 1}^{𝑘} \underset{1 \leq 𝑖_{1} < \dots < 𝑖_{𝑘} \leq 𝑛 - 1 : 𝑚 concesutive segments}{∑︁} (𝑛 - 𝑘)! \\ = 𝑛! + {\sum︁}_{𝑘 = 0}^{𝑛 - 1} (- 1)^{𝑘} \underset{1 \leq 𝑖_{1} < \dots < 𝑖_{𝑘} \leq 𝑛 - 1}{∑︁} (𝑛 - 𝑘)! \\ = 𝑛! + {\sum︁}_{𝑘 = 0}^{𝑛 - 1} (- 1)^{𝑘} (\binom{𝑛 - 1}{𝑘}) (𝑛 - 𝑘)! . \end{array}

□

Theorem 5.5. For $𝑛 \geq 2$ ,

𝑄_{𝑛} = 𝐷_{𝑛} + 𝐷_{𝑛 - 1} .

Proof. [TODO: ] □

5.5 Euler totient function

Definition 5.5. The Euler totient function is defined by

𝜑 (𝑛) = # integers from 1 to n coprime to n, \forall 𝑛 \in Z_{+} .

Theorem 5.6. Let $𝑛$ be a positive integer factorized into the form

𝑛 = 𝑝_{1}^{𝑒_{1}} \dots 𝑝_{𝑟}^{𝑒_{𝑟}},

where $𝑝_{1}, \dots, 𝑝_{𝑟}$ are distinct primes with powers $\geq 1$ . Then,

𝜑 (𝑛) = 𝑛 \prod_{𝑖 = 1}^{𝑟} (1 - \frac{1}{𝑝_{𝑖}}) .

Proof. Hint: The number of integers from $1$ to $𝑛$ divisible by distinct primes $𝑝_{𝑖_{1}}, \dots, 𝑝_{𝑖_{𝑘}}$ is $\frac{𝑛}{𝑝_{𝑖_{1}} \dots 𝑝_{𝑖_{𝑘}}}$ . Use the Inclusion-Exclusion principle. □

5.6 rook polynomials

6 ○Möbius inversion

Definition 6.1 (locally finite poset). A poset $(𝑋, ≼)$ is called locally finite, if for any $𝑥 ≼ 𝑦 \in 𝑋$ the closed interval $[𝑥, 𝑦] : = {𝑧 \in 𝑋 | 𝑥 ≼ 𝑧 ≼ 𝑦}$ is a finite set.

Definition 6.2. Let $(𝑋, ≼)$ be a locally finite poset. The incidence algebra of $𝑋$ is a tuple $(I (𝑋), *)$ , where

$I (𝑋)$ is the vector space of real-valued functions on the set of nonempty intervals ( ${[𝑥, 𝑦] : 𝑥, 𝑦 \in 𝑋, 𝑥 ≼ 𝑦}$ ). Each member $𝑓$ of $I (𝑋)$ is called an incidence function of the poset $𝑋$ . We assume that every incidence function has a value $0$ at $(𝑥, 𝑦)$ for any $𝑥 ⋠ 𝑦$ .
The product $*$ is the convolution:

$\forall 𝑥 ≼ 𝑦 : (𝛼 * 𝛽) (𝑥, 𝑦) = \underset{𝑧 ∈[𝑥, 𝑦]}{\sum︁} 𝛼 (𝑥, 𝑧) 𝛽 (𝑧, 𝑦) .$

The red color emphasizes that the order is associated with the poset.

P.S. Why the variable in the convolution is taken $\in [𝑥, 𝑦]$ instead of $\in 𝑋$ as usual? Actually, under the assumption that the incidence functions have a value $0$ for empty intervals, the usual convolution

\forall 𝑥, 𝑦 \in 𝑋 : (𝛼 * 𝛽) (𝑥, 𝑦) = \underset{𝑧 \in 𝑋}{\sum︁} 𝛼 (𝑥, 𝑧) 𝛽 (𝑧, 𝑦)

is still well-defined: If $𝑥 ⋠ 𝑦$ , either $𝑥 ⋠ 𝑧$ or $𝑧 ⋠ 𝑦$ . Thus, either $𝛼 (𝑥, 𝑧) = 0$ or $𝛽 (𝑧, 𝑦) = 0$ . Therefore, $(𝛼 * 𝛽) (𝑥, 𝑦) = 0$ .

We can check that

the convolution satisfies the associative law: $𝛼 * (𝛽 * 𝛾) = (𝛼 * 𝛽) * 𝛾$ .

Definition 6.3. An incidence function $𝛿$ is called an identity of the algebra $(I (𝑋), *)$ if

𝛿 * 𝑓 = 𝑓 = 𝑓 * 𝛿 .

There is a unique identity (using Method 6.1), which is the delta function defined by

𝛿 (𝑥, 𝑦) = {\begin{matrix} 1, & if 𝑥 = 𝑦, \\ 0, & if 𝑥 \neq 𝑦, \end{matrix} .

Method 6.1. Given incidence functions $𝑓$ and $ℎ$ and an equation $𝑔 * 𝑓 = ℎ$ , we can solve $𝑔$ inductively as follows:

\begin{matrix} \begin{aligned} 𝑔 (𝑥, 𝑥) & = \frac{ℎ (𝑥, 𝑥)}{𝑓 (𝑥, 𝑥)}, \forall 𝑥 \in 𝑋, \\ 𝑔 (𝑥, 𝑦) & = \frac{1}{𝑓 (𝑦, 𝑦)} [ℎ (𝑥, 𝑦) - \underset{𝑥 ≼ 𝑧 ≺ 𝑦}{\sum︁} 𝑔 (𝑥, 𝑧) 𝑓 (𝑧, 𝑦)], \forall 𝑥 ≺ 𝑦 \end{aligned} \end{matrix}

from “closer” pairs to “further” pairs, as long as

𝑓 (𝑥, 𝑥) \neq 0 \forall 𝑥 \in 𝑋

Definition 6.4. Given an incidence function $𝑓$ ,

an incidence function $𝑔$ is a left inverse of $𝑓$ if $𝑔 * 𝑓 = 𝛿$ ,
an incidence function $𝑔$ is a right inverse of $𝑓$ if $𝑓 * 𝑔 = 𝛿$ ,
$𝑓$ is invertible if it has both a left and right inverse. In this case, the two inverses are identical and unique, called the inverse of $𝑓$ .

Proposition 6.1. An incidence function $𝑓$ is invertible iff $𝑓 (𝑥, 𝑥) \neq 0$ for all $𝑥 \in 𝑋$ (using Method 6.1).

Definition 6.5. The zeta function $𝜁$ of the poset $(𝑋, ≼)$ is the “constant 1 function”, i.e.,

𝜁 (𝑥, 𝑦) = 1, \forall 𝑥 ≼ 𝑦 .

Definition 6.6. The Möbius function $𝜇$ of the poset $(𝑋, ≼)$ is the inverse of the zeta function $𝜁$ ,

𝜇 = 𝜁^{- 1} .

[TODO: locally finite?]

Clearly, $𝜁$ is invertible. Using Method 6.1, we can solve $𝜇$ :

\begin{matrix} \begin{aligned} 𝜇 (𝑥, 𝑥) & = 1, \forall 𝑥 \in 𝑋, \\ 𝜇 (𝑥, 𝑦) & = - \underset{𝑥 ≼ 𝑧 ≺ 𝑦}{\sum︁} 𝜇 (𝑥, 𝑧) = - \underset{𝑥 ≺ 𝑧 ≼ 𝑦}{\sum︁} 𝜇 (𝑧, 𝑦), \forall 𝑥 ≺ 𝑦 . \end{aligned} \end{matrix}

Lemma 6.1. Let $𝑃, 𝑄$ be isomorphic posets, that is, there exists a bijection $𝜙 : 𝑃 \to 𝑄$ preserving the partial order:

\forall 𝑥, 𝑦 \in 𝑃 : 𝑥 ≼_{𝑃} 𝑦 ⟺ 𝜙 (𝑥) ≼_{𝑄} 𝜙 (𝑦) .

Then, the induced map $𝜙 : I (𝑃) \to I (𝑄)$ defined by

\forall 𝑓 \in I (𝑃) : [𝜙 (𝑓)] (𝜙 (𝑥), 𝜙 (𝑦)) = 𝑓 (𝑥, 𝑦)

is an algebra isomorphism.

In particular,

𝛿_{𝑄} = 𝜙 (𝛿_{𝑃}), 𝜁_{𝑄} = 𝜙 (𝜁_{𝑃}), 𝜇_{𝑄} = 𝜙 (𝜇_{𝑃}) .

Proof. [TODO: ] □

6.1 Möbius inversion

Given a finite $(𝑋, ≼)$ . Let $𝐹 (𝑋)$ denote the set of functions $: 𝑋 \to R$ . For every $𝑓 \in 𝐹$ , we define the product as

\begin{matrix} \begin{aligned} (𝛼 * 𝑓) (𝑥) = \underset{𝑥 ≼ 𝑦}{\sum︁} 𝛼 (𝑥, 𝑦) 𝑓 (𝑦), \forall 𝑥 \in 𝑋, \\ (𝑓 * 𝛼) (𝑥) = \underset{𝑥 ≼ 𝑦}{\sum︁} 𝑓 (𝑥) 𝛼 (𝑥, 𝑦), \forall 𝑦 \in 𝑋 . \end{aligned} \end{matrix}

Proposition 6.2. Let $(𝑋, ≼)$ be a finite poset, $𝛼 \in I (𝑋)$ be invertible, and $𝑓, 𝑔 \in 𝐹 (𝑋)$ . Then

\begin{matrix} \begin{aligned} 𝑔 = 𝛼 * 𝑓 ⟺ 𝑓 = 𝛼^{- 1} * 𝑔, \\ 𝑔 = 𝑓 * 𝛼 ⟺ 𝑓 = 𝑔 * 𝛼^{- 1} . \end{aligned} \end{matrix}

[TODO: associativity]

Theorem 6.1 (Möbius inversion formula). Let $(𝑋, ≼)$ be a finite poset and $𝑓, 𝑔 \in 𝐹 (𝑋)$ . Then,

\begin{matrix} \begin{aligned} 𝑔 (𝑥) = \underset{𝑥 ≼ 𝑦}{\sum︁} 𝑓 (𝑦), \forall 𝑥 \in 𝑋 ⟺ 𝑓 (𝑥) = \underset{𝑥 ≼ 𝑦}{\sum︁} 𝜇 (𝑥, 𝑦) 𝑔 (𝑦), \forall 𝑥 \in 𝑋; \\ 𝑔 (𝑦) = \underset{𝑥 ≼ 𝑦}{\sum︁} 𝑓 (𝑥), \forall 𝑦 \in 𝑋 ⟺ 𝑓 (𝑦) = \underset{𝑥 ≼ 𝑦}{\sum︁} 𝑔 (𝑥) 𝜇 (𝑥, 𝑦), \forall 𝑦 \in 𝑋 . \end{aligned} \end{matrix}

Proof. The LHSs can be written as $𝑔 = 𝜁 * 𝑓$ and $𝑔 = 𝑓 * 𝜁$ , respectively. Then, apply the previous proposition. □

6.2 product posets

Let $(𝑋_{1}, ≼_{1})$ and $(𝑋_{2}, ≼_{2})$ be two posets. Then the partial order of the product poset $(𝑋_{1} \times 𝑋_{2}, ≼)$ is defined by

(𝑥_{1}, 𝑥_{2}) ≼ (𝑦_{1}, 𝑦_{2}) ⟺ 𝑥_{1} ≼_{1} 𝑥_{2}, 𝑦_{1} ≼_{2} 𝑦_{2} .

TEST

Theorem 6.2. Let $𝜇_{𝑖}$ be the Möbius function of the poset $(𝑋_{𝑖}, ≼_{𝑖})$ for $𝑖 = 1, 2$ . Then the Möbius function $𝜇$ of $(𝑋_{1} \times 𝑋_{2}, ≼)$ is given by

𝜇 ((𝑥_{1}, 𝑥_{2}), (𝑦_{1}, 𝑦_{2})) = 𝜇_{1} (𝑥_{1}, 𝑦_{1}) 𝜇_{2} (𝑥_{2}, 𝑦_{2}) .

Proof. Following Equation (6),

𝜇 ((𝑥_{1}, 𝑥_{2}), (𝑦_{1}, 𝑦_{2})) = - \underset{(𝑥_{1}, 𝑥_{2}) ≼ (𝑧_{1}, 𝑧_{2}) ≺ (𝑦_{1}, 𝑦_{2})}{\sum︁} 𝜇 ((𝑥_{1}, 𝑥_{2}), (𝑧_{1}, 𝑧_{2})) .

We proceed by induction. Note that the length of the longest chain in $(𝑥_{1}, 𝑥_{2}), (𝑧_{1}, 𝑧_{2})$ is strictly smaller than the one of $(𝑥_{1}, 𝑥_{2}), (𝑦_{1}, 𝑦_{2})$ , so we induct on the length.

\begin{array}{l} = - \underset{(𝑥_{1}, 𝑥_{2}) ≼ (𝑧_{1}, 𝑧_{2}) ≺ (𝑦_{1}, 𝑦_{2})}{\sum︁} 𝜇_{1} (𝑥_{1}, 𝑧_{1}) 𝜇_{2} (𝑥_{2}, 𝑧_{2}) \\ = 𝜇_{1} (𝑥_{1}, 𝑦_{1}) 𝜇_{2} (𝑥_{2}, 𝑦_{2}) - \underset{(𝑥_{1}, 𝑥_{2}) ≼ (𝑧_{1}, 𝑧_{2}) ≼ (𝑦_{1}, 𝑦_{2})}{\sum︁} 𝜇_{1} (𝑥_{1}, 𝑧_{1}) 𝜇_{2} (𝑥_{2}, 𝑧_{2}) \end{array}

The sum can be decomposed into a product of two sums because $≼$ comprises two “independent” conditions, $≼_{1}$ and $≼_{2}$ .

\begin{array}{l} = 𝜇_{1} (𝑥_{1}, 𝑦_{1}) 𝜇_{2} (𝑥_{2}, 𝑦_{2}) - \underset{𝑥_{1} ≼_{1} 𝑧_{1} ≼_{1} 𝑦_{1}}{\sum︁} 𝜇_{1} (𝑥_{1}, 𝑧_{1}) \underset{𝑥_{2} ≼_{2} 𝑧_{2} ≼ 𝑦_{2}}{\sum︁} 𝜇_{2} (𝑥_{2}, 𝑧_{2}) \\ = 𝜇_{1} (𝑥_{1}, 𝑦_{1}) 𝜇_{2} (𝑥_{2}, 𝑦_{2}) - (𝜇_{1} * 𝜁) (𝑥_{1}, 𝑦_{1}) \cdot (𝜇_{2} * 𝜁) (𝑥_{2}, 𝑦_{2}) = 𝜇_{1} (𝑥_{1}, 𝑦_{1}) 𝜇_{2} (𝑥_{2}, 𝑦_{2}) . \end{array}

□

6.3 ○[TODO: number theoretical]

Definition 6.7. The poset $(Z_{+}, |)$ with the partial order of divisibility is locally finite. Let $𝜇$ be the Möbius function of $(Z_{+}, |)$ . Then, the Möbius function of number theory is defined by

𝜇 (𝑛) : = 𝜇 (1, 𝑛) .

Proposition 6.3. If $𝑎 | 𝑏$ , then $𝜇 (𝑎, 𝑏) = 𝜇 (1, 𝑏 / 𝑎)$ .

Proof. Using the poset isomorphism $𝑥 \mapsto 𝑥 / 𝑎$ and Lemma 6.1, we have [TODO: ]. □

By Equation 6, we have

Proposition 6.4.

\begin{matrix} \begin{aligned} 𝜇 (1) & = 1, \\ 𝜇 (𝑛) & = - \underset{𝑑 | 𝑛, 𝑑≠𝑛}{\sum︁} 𝜇 (𝑑), 𝑛 \geq 2 . \end{aligned} \end{matrix}

Derivation 6.1. Lemma 6.1

Its Möbius function is partially given by

\begin{array}{l} 𝜇 (𝑚, 𝑚) & = 1, \forall 𝑚 \in Z_{+}, \\ 𝜇 (1, 𝑛) & = - \underset{𝑚 \in Z_{+}, 𝑚 | 𝑛, 𝑚≠𝑛}{\sum︁} 𝜇 (1, 𝑚) \end{array}

Theorem 6.3. The Möbius function of number theory is explicitly given by

𝜇 (𝑛) = {\begin{matrix} 1, & if 𝑛 = 1 \\ (- 1)^{𝑗}, & if 𝑛 is a product of 𝑗 distinct, non-repeated primes \\ 0, & if 𝑛 has repeated prime factors . \end{matrix}

Proposition 6.5 (Möbius inversion, number theory). Given functions $𝑓, 𝑔 : Z_{\pm} \to C$ ,

𝑔 (𝑛) = \underset{𝑑 | 𝑛}{\sum︁} 𝑓 (𝑑), \forall 𝑛 \in Z_{+} ⟺ 𝑓 (𝑛) = \underset{𝑑 | 𝑛}{\sum︁} 𝑔 (𝑑) 𝜇 (\frac{𝑛}{𝑑}), \forall 𝑛 \in Z_{+} .

Proof. $𝑔 = 𝑓 * 𝜁 ⟺ 𝑓 = 𝑔 * 𝜇$ by Möbius inversion 6.1. $𝜇 (𝑑, 𝑛) = 𝜇 (𝑛 / 𝑑)$ . □

Theorem 6.4. The Euler totient function $𝜙$ (Defn. 5.5) and Möbius function $𝜇$ are related by

\frac{𝜑 (𝑛)}{𝑛} = \underset{𝑑 | 𝑛}{\sum︁} \frac{𝜇 (𝑑)}{𝑑}, 𝑛 \in Z_{+} .

Proof. Let $Φ (𝑑, 𝑛) : = {𝑎 \in [𝑛] : gcd (𝑎, 𝑛) = 𝑑}$ . [TODO: $Φ (𝑑, 𝑛) \to Φ (1, 𝑛 / 𝑑)$ is a bijection] $[𝑛] = \sum_{𝑑 | 𝑛} Φ (𝑑, 𝑛)$ . Thus, $𝑛 = \sum_{𝑑 | 𝑛} | Φ (1, 𝑛 / 𝑑) | = 𝜑 * 𝜁 ⟺ 𝜑 = 𝑛 * 𝜇$ . □

6.4 ○circular permutations with repetition

Definition 6.8. Let $𝑀 = {\infty 𝑎_{1}, \dots, \infty 𝑎_{𝑘}}$ , where $𝑘 \geq 2$ , be a multiset of $𝑘$ distinct elements. A circular $𝑛$ -permutation of $𝑀$ is an arrangement of $𝑛$ elements of $𝑀$ on a circle.

Theorem 6.5. The number of circular $𝑛$ -permutations of a set of $𝑘$ distinct objects with repetition allowed is

\frac{1}{𝑛} \underset{𝑑 | 𝑛}{\sum︁} 𝑘^{𝑑} 𝜑 (\frac{𝑛}{𝑑}) = \frac{1}{𝑛} \underset{𝑎 | 𝑛}{\sum︁} 𝑘^{𝑛 / 𝑎} 𝜑 (𝑎) .

Proof. [TODO: ] □

Theorem 6.6. Let $𝑀$ be a multiset of type $(𝑛_{1}, \dots, 𝑛_{𝑘})$ with $𝑚 = gcd (𝑛_{1}, …, 𝑛_{𝑘})$ and $𝑛 = 𝑛_{1} + \dots + 𝑛_{𝑘}$ . Then the number of circular permutations of $𝑀$ is

\frac{1}{𝑛} \underset{𝑑 | 𝑚}{\sum︁} 𝜑 (𝑑) (\binom{𝑛 / 𝑑}{𝑛_{1} / 𝑑, \dots, 𝑛_{𝑘} / 𝑑}) .

Proof. [TODO: ] □

7 Linear Recurrence

Definition 7.1. A sequence $(𝑥_{𝑛}; 𝑛 \geq 0)$ of numbers is said to satisfy a linear recurrence relation of order $𝑘$ if there exist functions $𝛼$ ’s, with $𝛼_{𝑘} (𝑛) \neq 0$ , and a function $𝛽$ such that

\forall 𝑛 \geq 𝑘 : 𝑥_{𝑛} = 𝛼_{1} (𝑛) 𝑥_{𝑛 - 1} + 𝛼_{2} (𝑛) 𝑥_{𝑛 - 2} + \dots + 𝛼_{𝑘} (𝑛) 𝑥_{𝑛 - 𝑘} + 𝛽 (𝑛) .

homogeneity

The linear recurrence relation in (7.1) is said to be homogeneous if $𝛽 (𝑛) = 0$ for all $𝑛 \geq 𝑘$ and non-homogeneous otherwise.

constant

A linear recurrence relation with constant coefficients has constant $𝛼$ functions (excluding $𝛽$ ).

solution

A solution of the relation is a sequence satisfying the relation. A general solution is a solution with parameters $𝑐_{1}, \dots, 𝑐_{𝑘}$ and functions $𝑢_{𝑛}$ :

𝑥_{𝑛} = 𝑢_{𝑛} (𝑐_{1}, \dots, 𝑐_{𝑘}), 𝑛 \geq 0,

provided that for arbitrary initial values $𝑢_{0}, \dots, 𝑢_{𝑘 - 1}$ there exist constants $𝑐_{1}, \dots, 𝑐_{𝑘}$ such that (7.1) is the (unique) solution satisfying the initial conditions.

P.S. There can be many sequences satisfying the same linear recurrence relation. When we talk about the set of solutions / general solution, we refer to the case where the initial values are unspecified.

On one hand, by definition, every solution has initial values
On the other hand, any set of initial $𝑘$ values uniquely determines a solution.

7.1 ○structural theorems

Notation 1. Throughout this section, the vector notation ( $\vec{𝑎}$ ) will be used to write sequences ( $(𝑎_{𝑛}; 𝑛 \geq 0)$ ).

Let $S_{\infty}$ denote the infinite-dimensional vector space of all sequences under the ordinary addition and scalar multiplication.

Theorem 7.1. Let $H_{𝑘}$ denote the set of solutions of a homogeneous linear recurrence relation. Then, $H_{𝑘}$ is $𝑘$ -dimensional subspace of $S_{\infty}$ .

Consequently, if $\vec{𝑢^{(1)}}, \dots, \vec{𝑢^{(𝑘)}}$ are linearly independent solutions in $H_{𝑘}$ , then all solutions in $H_{𝑘}$ can be written as

\vec{𝑥} = 𝑐_{1} \vec{𝑢^{(1)}} + \dots + 𝑐_{𝑘} \vec{𝑢^{(𝑘)}} .

Proof. It is easy to check that $H_{𝑘}$ is a vector subspace.

To show that $H_{𝑘}$ is $𝑘$ -dimensional, we just need to find an isomorphism between $H_{𝑘}$ and $R^{𝑘}$ . Consider the projection $𝜋 : S_{\infty} \to R^{𝑘}$ defined by $\vec{𝑥} \mapsto (𝑥_{0}, 𝑥_{1}, \dots, 𝑥_{𝑘 - 1})$ . Then, $𝜋 |_{H_{𝑘}}$ is an isomorphism. □

Corollary 7.1. Given a set of specific solutions of a homogeneous linear recurrence relation, any linear combination of them is also a solution.

Theorem 7.2. Let $N_{𝑘}$ denote the set of solutions of a non-homogeneous linear recurrence relation. Let $H_{𝑘}$ be the set of solutions of the corresponding homogeneous linear recurrence. Then, $N_{𝑘}$ is a translate of $H_{𝑘}$ in $S_{\infty}$ , i.e.,

N_{𝑘} = \vec{𝑛} + H_{𝑘} = {\vec{𝑛} + \vec{ℎ} : \vec{ℎ} \in H_{𝑘}} .

(a.k.a. a $𝑘$ -dimensional affine subspace of $S_{\infty}$ )

Geometrical intepretation: not through the origin

Proof. Let $\vec{𝑣} \in N_{𝑘}$ . Show that $\vec{𝑛} - \vec{𝑣} \in H_{𝑘}$ ( $𝛽 (𝑛)$ is cancelled). This proves that $N_{𝑘} \subset \vec{𝑛} + H_{𝑘}$ .

Conversely, show that, for any $\vec{ℎ} \in H_{𝑘}$ , $\vec{𝑛} + \vec{ℎ}$ is a solution. □

Method 7.1. By Theorem 7.2, we can summarize a general framework for finding the solutions to a nonhomogeneous linear recurrence relation:

1.: Find the general solution of the corresponding homogeneous relation (special cases in Sec. 7.2.1).
2.: Find a particular solution of the nonhomogeneous relation (special cases in Sec. 7.2.1)
3.: Sum the two to derive the general solution of the nonhomogeneous relation. Determine values of the constants (parameters) that satisfy the initial conditions.

Definition 7.2. Let $(𝑢_{𝑛}^{(1)}), \dots, (𝑢_{𝑛}^{(𝑘)})$ be $𝑘$ solutions in $H_{𝑘}$ . Then the Wronskian $𝑊_{𝑘} (𝑛)$ of the solutions is defined as

𝑊_{𝑘} (𝑛) : = det [\begin{matrix} 𝑢_{𝑛}^{(1)} & \dots & 𝑢_{𝑛}^{(𝑘)} \\ ⋮ & ⋱ & ⋮ \\ 𝑢_{𝑛 + 𝑘 - 1}^{(1)} & \dots & 𝑢_{𝑛 + 𝑘 - 1}^{(𝑘)} \end{matrix}],

i.e., the window of $𝑘$ items of the solutions starting from position $𝑛$ .

7.2 constant coefficients

Derivation 7.1. Consider linear recurrence relation of order 1: $𝑥_{𝑛} = 𝑞 𝑥_{𝑛 - 1}, 𝑛 \geq 1$ . It is heuristic that the solution of it is a geometric sequence, $𝑥_{𝑛} = 𝑞^{𝑛}$ . This hints that higher order relation with constant coefficients may have some solutions of the same form (not unique).

7.2.1 homogeneous

In this section, we consider homogeneous linear recurrence relations with constant coefficients.

Definition 7.3. Let $(𝑥_{𝑛}; 𝑛 \geq 0)$ be a homogeneous linear recurrence relation of order $𝑘$ with constant coefficients:

𝑥_{𝑛} = 𝛼_{1} 𝑥_{𝑛 - 1} + \dots + 𝛼_{𝑘} 𝑥_{𝑛 - 𝑘}, 𝛼_{𝑘} \neq 0 .

Its characteristic equation is formed by substituting $𝑥_{𝑛 - 𝑖} = 𝑟^{𝑘 - 𝑖}$ , defined as

𝑟^{𝑘} - 𝛼_{1} 𝑟^{𝑘 - 1} - \dots - 𝛼_{𝑘} = 0 .

Its characteristic polynomial is defined as

𝑃 (𝑟) = 𝑟^{𝑘} - 𝛼_{1} 𝑟^{𝑘 - 1} - \dots - 𝛼_{𝑘} .

Theorem 7.3.

(a)

Let

𝑞 \neq 0

. The sequence

𝑥_{𝑛} = 𝑞^{𝑛}, 𝑛 \geq 0

is a solution of (7.3) iff

𝑞

is a root of the characteristic equation (7.3).

(b)

Suppose the nonzero roots of characteristic equation (7.3) are

𝑘

distinct values

𝑞_{1}, \dots, 𝑞_{𝑘}

, then the general solution of (7.3) is given by

𝑥_{𝑛} = 𝑐_{1} 𝑞_{1}^{𝑛} + \dots + 𝑐_{𝑘} 𝑞_{𝑘}^{𝑛}, 𝑛 \geq 0 .

Proof.

(a): Easy to check (Note that $𝑞 \neq 0$ is important).
(b): Check the condition in the definition: [TODO: ]

□

Example 7.1. Fibonacci sequence: $𝑓_{0} = 0, 𝑓_{1} = 1$ , $𝑓_{𝑛} = 𝑓_{𝑛 - 1} + 𝑓_{𝑛 - 2}$ for $𝑛 \geq 2$ .

The characteristic equation is $𝑟^{2} - 𝑟 - 1 = 0$ , with two roots $\frac{1 \pm \sqrt{5}}{2}$ . By Theorem 7.3 (b), the general solution is

𝑓_{𝑛} = 𝑐_{1} {(\frac{1 + \sqrt{5}}{2})}^{𝑛} + 𝑐_{2} {(\frac{1 - \sqrt{5}}{2})}^{𝑛} .

Plugging this into $𝑓_{0}$ and $𝑓_{1}$ , we can obtain $𝑐_{1} = 1 / \sqrt{5}, 𝑐_{2} = - 1 / \sqrt{5}$ . Thus, the general formula for the Fibonacci sequence is

𝑓_{𝑛} = \frac{1}{\sqrt{5}} {(\frac{1 + \sqrt{5}}{2})}^{𝑛} - \frac{1}{\sqrt{5}} {(\frac{1 - \sqrt{5}}{2})}^{𝑛} .

Definition 7.4. A number $𝑞$ is a root of multiplicity $𝑚$ of a polynomial if it occurs $𝑚$ times in the multiset of roots. In other words, the polynomial can be decomposed as $(𝑟 - 𝑞)^{𝑚} 𝑠 (𝑟)$ where $𝑠 (𝑟)$ is a polynomial with $𝑠 (𝑞) \neq 0$ .

Theorem 7.4.

If $𝑞$ is a nonzero root with multiplicity $𝑚$ of the characteristic equation (7.3), then

$𝑥_{𝑛} = 𝑞^{𝑛}, 𝑛 𝑞^{𝑛}, \dots, 𝑛^{𝑚 - 1} 𝑞^{𝑛}, 𝑛 \geq 0$

are linearly independent solutions for (7.3).

Suppose the nonzero roots of the characteristic equation (7.3) are $𝑠$ distinct values $𝑞_{1}, \dots, 𝑞_{𝑠}$ with the multiplicities $𝑚_{1}, \dots, 𝑚_{𝑠}$ respectively. Then

\begin{aligned} 𝑥_{𝑛} = & 𝑞_{1}^{𝑛}, & 𝑛 𝑞_{1}^{𝑛}, & \dots, & 𝑛^{𝑚_{1} - 1} 𝑞_{1}^{𝑛}, \\ 𝑞_{2}^{𝑛}, & 𝑛 𝑞_{2}^{𝑛}, & \dots, & 𝑛^{𝑚_{2} - 1} 𝑞_{2}^{𝑛}, \\ \dots \\ 𝑞_{𝑠}^{𝑛}, & 𝑛 𝑞_{𝑠}^{𝑛}, & \dots, & 𝑛^{𝑚_{𝑠} - 1} 𝑞_{𝑠}^{𝑛}, 𝑛 \geq 0 \end{aligned}

are linearly independent solutions of (7.3). Morever, their linear combinations form all solutions of (7.3).

Proof.

(a)

Since

𝑛^{𝑗}

can be represented as a linear combination of falling factorial polynomial

(𝑛)_{𝑗}

’s (Definition 9.8, Theorem 9.5), and that linear combinations of solutions are also solutions (the solution space is a vector space), it suffices to prove the following claim:

For every $0 \leq 𝑗 \leq 𝑚 - 1$ , $𝑥_{𝑛} = (𝑛)_{𝑗} 𝑞^{𝑛}$ is a solution.

Plugging $𝑥_{𝑛} = (𝑛)_{𝑗} 𝑞^{𝑛}$ into

𝑥_{𝑛} - 𝛼_{1} 𝑥_{𝑛 - 1} - \dots - 𝛼_{𝑘} 𝑥_{𝑛 - 𝑘} : = 𝑅 (𝑞),

we have (Hint, the coefficients here do not matter much)

𝑅 (𝑞) = - {\sum︁}_{𝑖 = 0}^{𝑘} 𝛼_{𝑖} (𝑛 - 𝑖)_{𝑗} 𝑞^{𝑛} .

Note that $(𝑞^{𝑛})^{(𝑗)} = (𝑛)_{𝑗} 𝑞^{𝑛 - 𝑗}$ for $𝑛 \geq 𝑗$ . If we lift the degree of $𝑃 (𝑟)$ by multiplying it with $𝑟^{𝑛 - 𝑘}$ , we have

(𝑟^{𝑛 - 𝑘} 𝑃 (𝑟))^{(𝑗)} = - {∑︁}_{𝑖 = 0}^{𝑘} 𝛼_{𝑖} (𝑛 - 𝑖)_{𝑗} 𝑞^{𝑛 - 𝑗} .

Thus, $𝑅 (𝑞) = 𝑞^{𝑗} (𝑟^{𝑛 - 𝑘} 𝑃 (𝑟))^{(𝑗)} |_{𝑟 = 𝑞}$ .

Since the root $𝑞$ has multiplicity $𝑚$ , we can write the characteristic polynomial (7.3) as $𝑃 (𝑟) = (𝑟 - 𝑞)^{𝑚} 𝑄 (𝑟)$ for some polynomial $𝑄 (𝑟)$ of degree $𝑘 - 𝑚$ . Then,

\begin{array}{l} 𝑅 (𝑞) & = 𝑞^{𝑗} [𝑟^{𝑛 - 𝑘} (𝑟 - 𝑞)^{𝑚} 𝑄 (𝑟)]^{(𝑗)} |_{𝑟 = 𝑞} \\ = 𝑞^{𝑗} {∑︁}_{𝑖 = 0}^{𝑗} (\binom{𝑗}{𝑖}) [(𝑟 - 𝑞)^{𝑚}]^{(𝑖)} |_{𝑟 = 𝑞} [𝑟^{𝑛 - 𝑘} 𝑄 (𝑟)]^{(𝑗 - 𝑖)} |_{𝑟 = 𝑞} \\ = 𝑞^{𝑗} {∑︁}_{𝑖 = 0}^{𝑗} (\binom{𝑗}{𝑖}) (𝑚)_{𝑖} (𝑟 - 𝑞)^{𝑚 - 𝑖} |_{𝑟 = 𝑞} [𝑟^{𝑛 - 𝑘} 𝑄 (𝑟)]^{(𝑗 - 𝑖)} |_{𝑟 = 𝑞} = 0 . \end{array}

(b)

[TODO: ]

□

7.2.2 nonhomogeneous

In this section, we consider nonhomogeneous linear recurrence relations with constant coefficients.

Theorem 7.5 (coefficient variation methods). Given a recurrence relation of the second order of the form

𝑥_{𝑛} = 𝛼_{1} 𝑥_{𝑛 - 1} + 𝛼_{2} 𝑥_{𝑛 - 2} + 𝑐 𝑞^{𝑛},

where $𝛼_{1}, 𝛼_{2}, 𝑐, 𝑞$ are constants. Let $𝑞_{1}, 𝑞_{2}$ be roots of its characteristic euqtion $𝑥^{2} - 𝛼_{1} 𝑥 - 𝛼_{2} = 0$ . Then (7.5) has a particular solution of the following forms, where $𝐴$ is a constant to be determined:

(a): If $𝑞 \neq 𝑞_{1}, 𝑞 \neq 𝑞_{2}$ , then $𝑥_{𝑛} = 𝐴 𝑞^{𝑛}$ .
(b): If $𝑞 = 𝑞_{1}, 𝑞 \neq 𝑞_{2}$ , then $𝑥_{𝑛} = 𝐴𝑛 𝑞^{𝑛}$ .
(c): If $𝑞 = 𝑞_{1} = 𝑞_{2}$ , then $𝑥_{𝑛} = 𝐴 𝑛^{2} 𝑞^{𝑛}$ .

Theorem 7.6 (not required). Given a recurrence relation of the second order

𝑥_{𝑛} = 𝛼_{1} 𝑥_{𝑛 - 1} + 𝛼_{2} 𝑥_{𝑛 - 2} + 𝛽 (𝑛), 𝑛 \geq 2,

where $𝛽 (𝑛)$ is a polynomial with degree $𝑘$ .

If $𝛼_{1} + 𝛼_{2} \neq 1$ , then there is a particular solution of the form
$𝑥_{𝑛} = {\sum︁}_{𝑖 = 0}^{𝑘} 𝐴_{𝑖} 𝑛^{𝑖},$

where $𝐴_{𝑖}$ ’s are constants to be determined. If $𝑘 \leq 2$ , then a particular solution has the form
$𝑥_{𝑛} = 𝐴_{0} + 𝐴_{1} 𝑛 + 𝐴_{2} 𝑛^{2} .$

If $𝛼_{1} + 𝛼_{2} = 1$ , then the relation can be reduced to a first order recurrence relation

𝑦_{𝑛} = (𝛼_{1} - 1) 𝑦_{𝑛 - 1} + 𝛽 (𝑛), 𝑛 \geq 2,

where $𝑦_{𝑛} = 𝑥_{𝑛} - 𝑥_{𝑛 - 1}$ .

Example 7.2. Solve $𝑎_{𝑛} = 4 𝑎_{𝑛 - 1} - 4 𝑎_{𝑛 - 2} + 3 𝑛 + 1$ with $𝑎_{0} = 1, 𝑎_{1} = 2$ .

Since $4 + (- 4) = 0 \neq 1$ , we can apply case (a). $3 𝑛 + 1$ has degree 1, so a particular solution is $𝑎_{𝑛} = 𝐴_{0} + 𝐴_{1} 𝑛$ . We can solve that $𝐴_{1} = 3, 𝐴_{0} = 13$ .

The corresponding homogeneous relation has the general solution $𝑐_{0} 2^{𝑛} + 𝑐_{1} 𝑛 2^{𝑛}$ .

The general solution of the nonhomogenous relation is then $𝑎_{𝑛} = 13 + 3 𝑛 + 𝑐_{0} 2^{𝑛} + 𝑐_{1} 𝑛 2^{𝑛}$ . Plugging this into the intial conditions, we can obtain $𝑐_{0} = - 12, 𝑐_{1} = 5$ .

8 ○Generating Functions

8.1 ordinary generating function

Definition 8.1. The ordinary generating function of an infinite sequence $𝑎_{0}, 𝑎_{1}, \dots$ is the power series

𝐴 (𝑥; 𝑎_{𝑛}) = {\sum︁}_{𝑖 = 0}^{\infty} 𝑎_{𝑖} 𝑥^{𝑖} .

The power series is said to have a rational form if, within its domain of convergence, it can be written as a quotient of two polynomials.

Remark 2 (Useful series).

$\sum_{𝑛 = 0}^{\infty} 𝑥^{𝑛} = \frac{1}{1 - 𝑥}$ , $| 𝑥 | < 1$ .
$\sum_{𝑛 = 0}^{\infty} 𝑐 𝑎^{𝑛} 𝑥^{𝑛} = \frac{𝑐}{1 - 𝑎𝑥}$ , $| 𝑥 | < \frac{1}{| 𝑎 |}$ .
$\frac{1}{(1 - 𝑥)^{𝑘}} = \sum_{𝑛 = 0}^{\infty} (\binom{- 𝑘}{𝑛}) (- 1)^{𝑛} 𝑥^{𝑛} = \sum_{𝑛 = 0}^{\infty} (\binom{𝑛 - 𝑘 + 1}{𝑛}) 𝑥^{𝑛} = \sum_{𝑛 = 0}^{\infty} ⟨ \binom{𝑘}{𝑛} ⟩ 𝑥^{𝑛}$ (Theorem 3.9 and Corollary 3.4).

Theorem 8.1. Let $𝑚$ be a positive integer. Let $𝑎_{𝑛}$ be the number of ways to distribute $𝑛$ indistinguishable balls into $𝑚$ distinct boxes. Then,

{\sum︁}_{𝑛 = 0}^{\infty} 𝑎_{𝑛} 𝑥^{𝑛} = {({∑︁}_{𝑛 = 0}^{\infty} 𝑥^{𝑛})}^{𝑚} = \frac{1}{(1 - 𝑥)^{𝑚}}

An algebraic proof is already provided in Remark 2.

combinatorial proof. The meaning of the LHS is clear.

Consider the expansion of the RHS without merging like terms. Every term is of the form
$1 \cdot 𝑥^{𝑛_{1} + \dots + 𝑛_{𝑚}} = 𝑥^{𝑛_{1}} \dots 𝑥^{𝑛_{𝑚}},$

formed by multiplying $𝑥^{𝑛_{𝑖}}$ ’s from the corresponding series. We view this as “1 way of distributing $(𝑛_{1} + \dots + 𝑛_{𝑚})$ indistinguishable balls into $𝑚$ distinguishable boxes,” achieved by “a conjunction of these $𝑚$ independent events: distributing $𝑛_{𝑖}$ balls into $𝑖$ -th box”.

Then, we collect the like terms. For every fixed $𝑛$ ,

𝑎_{𝑛} 𝑥^{𝑛} = \underset{𝑛_{1} + \dots + 𝑛_{𝑚} = 𝑛}{∑︁} 1 \cdot 𝑥^{𝑛} .

The addition here is viewed as a “disjunction of different kinds of ways to distribute $𝑛$ balls”.

This precisely describe the process of enumerating all the ways to distribute the balls. □

Theorem 8.2 (generalized). Let $𝑎_{𝑛}$ be the number of ways to distribute $𝑛$ indistinguishable balls into $𝑚$ distinguishable boxes, such that the number of balls held in the $𝑘$ -th box is allowed in a subset $𝐶_{𝑘} \subset Z_{\geq 0}$ . Then,

{\sum︁}_{𝑛 = 0}^{\infty} 𝑎_{𝑛} 𝑥^{𝑛} = ({∑︁}_{𝑛 \in 𝐶_{1}}^{\infty} 𝑥^{𝑛}) \dots ({∑︁}_{𝑛 \in 𝐶_{𝑚}}^{\infty} 𝑥^{𝑛}) .

Example 8.1. Let multiset $𝑀 = {\infty \cdot 𝑥_{1}, \infty \cdot 𝑥_{2}, \infty \cdot 𝑥_{3}, \infty \cdot 𝑥_{4}}$ . What is the generating function for $(𝑎_{𝑛}; 𝑛 \geq 0)$ , where $𝑎_{𝑛}$ is the number of $𝑛$ -combinations of $𝑀$ satisfying the following conditions? (1) $𝑥_{1}$ occurs $1$ , $3$ , or $11$ times, (2) $𝑥_{2}$ occurs at least $10$ times, (3) $𝑥_{3}$ occurs a multiple of 3 number of times.

$(𝑥^{1} + 𝑥^{3} + 𝑥^{11}) (\sum_{𝑖 = 10}^{\infty} 𝑥^{𝑖}) (\sum_{𝑖 = 0}^{\infty} 𝑥^{3 𝑖}) (\sum_{𝑖 = 0}^{\infty} 𝑥^{𝑖}) = \frac{(𝑥^{1} + 𝑥^{3} + 𝑥^{11}) 𝑥^{10}}{(1 - 𝑥)^{2} (1 - 𝑥^{3})}$ .

8.2 exponential generating function

Definition 8.2. The exponential generating function of an infinite sequence $𝑎_{0}, 𝑎_{1}, \dots$ is the power series

𝐸 (𝑥; 𝑎_{𝑛}) = {\sum︁}_{𝑛 = 0}^{\infty} \frac{𝑎_{𝑛}}{𝑛!} 𝑥^{𝑛} .

The power series is said to have a rational form if, within its domain of convergence, it can be written as a quotient of two polynomials.

Remark 3 (Useful series).

$\sum_{𝑛 = 0}^{\infty} \frac{𝑥^{𝑛}}{𝑛!} = 𝑒^{𝑥}$

Theorem 8.3. Let $𝑚$ be a positive integer. Let $𝑎_{𝑛}$ be the number of ways to distribute $𝑛$ distinct balls into $𝑚$ distinct boxes. Then,

{\sum︁}_{𝑛 = 0}^{\infty} 𝑎_{𝑛} \frac{𝑦^{𝑛}}{𝑛!} = {({\sum︁}_{𝑛 = 0}^{\infty} \frac{𝑦^{𝑛}}{𝑛!})}^{𝑘} = 𝑒^{𝑘𝑥}

Proof. This is related to Combination with distinguishable boxes 3.7. Easy to prove algebraically. □

Theorem 8.4 (generalized). Let $𝑎_{𝑛}$ be the number of ways to distribute $𝑛$ distinct balls into $𝑚$ distinct boxes, such that the number of balls held in the $𝑘$ -th box is allowed in a subset $𝐶_{𝑘} \subset Z_{\geq 0}$ . Then,

{\sum︁}_{𝑛 = 0}^{\infty} 𝑎_{𝑛} \frac{𝑥^{𝑛}}{𝑛!} = ({\sum︁}_{𝑛 \in 𝐶_{1}}^{\infty} \frac{𝑥^{𝑛}}{𝑛!}) \dots ({\sum︁}_{𝑛 \in 𝐶_{𝑚}}^{\infty} \frac{𝑥^{𝑛}}{𝑛!}) .

Example 8.2. Determine the number of ways of $𝑛$ -permutations of a multiset $𝑀 = {∞𝑎, ∞𝑏, ∞𝑐}$ such that the number of $𝑎$ is even and the number of $𝑏$ is at least one.

\begin{array}{l} 𝐸 (𝑥) & = ({\sum︁}_{𝑖 = 0}^{\infty} \frac{𝑥^{2 𝑖}}{(2 𝑖)!}) ({\sum︁}_{𝑖 = 1}^{\infty} \frac{𝑥^{𝑖}}{𝑖!}) ({\sum︁}_{𝑖 = 0}^{\infty} \frac{𝑥^{𝑖}}{𝑖!}) \\ = \frac{𝑒^{𝑥} + 𝑒^{- 𝑥}}{2} (𝑒^{𝑥} - 1) 𝑒^{𝑥} = \frac{1}{2} (𝑒^{3 𝑥} - 𝑒^{2 𝑥} + 𝑒^{𝑥} - 1) \\ = - \frac{1}{2} + {\sum︁}_{𝑖 = 0}^{\infty} \frac{3^{𝑛} - 2^{𝑛} + 1}{2} \cdot \frac{𝑥^{𝑛}}{𝑛!} . \end{array}

Thus,

𝑎_{𝑛} = \frac{3^{𝑛} - 2^{𝑛} + 1}{2}

for

𝑛 \geq 1

and

𝑎_{0} = - \frac{1}{2} + \frac{3^{0} - 2^{0} + 1}{2} = 0

8.3 connection with linear recurrence

Method 8.1. Suppose we have a recurrence relation $𝑎_{𝑛}$ of degree $𝑘$ . To solve it,

1.

We can write the generating function of

𝑎_{𝑛}

𝐴 (𝑥) = first 𝑘 terms + {\sum︁}_{𝑛 = 𝑘}^{\infty} (recurrence relation for 𝑎_{𝑛}) 𝑥^{𝑛} .

2.

Since the relation is linear, the series can be further written in terms of

𝐴 (𝑥)

(and minor terms).

𝐴 (𝑥) = first 𝑘 terms + some polynomial \cdot 𝐴 (𝑥) + some polynomial .

This gives a formula of $𝐴 (𝑥)$ as a quotient of two polynomials. (E.g., Theorem 8.5)

3.

Try to represent

𝐴 (𝑥)

as a linear combination of power series (E.g., Remark 2 and Proposition 8.1) so that we can have the explicit formula of

𝑎_{𝑛}

Theorem 8.5.

Let $(𝑎_{𝑛}; 𝑛 \geq 0)$ be a sequence satisfying the homogeneous linear recurrence relation of order $𝑘$ with constant coefficients (7.3). Then its generating function has a rational form, i.e.,

𝐴 (𝑥; 𝑎_{𝑛}) = \frac{𝑃 (𝑥)}{𝑄 (𝑥)},

where $𝑄 (𝑥)$ is a polynomial of degree $𝑘$ with a nonzero constant term and $𝑃 (𝑥)$ is a polynomial of degree strictly less than $𝑘$ .

See the explicit formula in the proof.

Conversely, given such polynomials $𝑃, 𝑄$ , then $\frac{𝑃 (𝑥)}{𝑄 (𝑥)}$ is a generating function of a sequence satisfying the above linear recurrence relation.

Proof.

Let $𝑃_{1} (𝑥) : = \sum_{𝑛 = 0}^{𝑘 - 1} 𝑎_{𝑛} 𝑥^{𝑛}$ .
$\begin{array}{l} 𝐴 (𝑥; 𝑎_{𝑛}) & = 𝑃_{1} (𝑥) + {\sum︁}_{𝑛 = 𝑘}^{\infty} 𝑎_{𝑛} 𝑥^{𝑛} = 𝑃_{1} (𝑥) + {\sum︁}_{𝑛 = 𝑘}^{\infty} ({∑︁}_{𝑖 = 1}^{𝑘} 𝛼_{𝑖} 𝑎_{𝑛 - 𝑖}) 𝑥^{𝑛} \\ = 𝑃_{1} (𝑥) + {\sum︁}_{𝑛 = 𝑘}^{\infty} ({∑︁}_{𝑖 = 1}^{𝑘} 𝛼_{𝑖} 𝑥^{𝑖} 𝑎_{𝑛 - 𝑖} 𝑥^{𝑛 - 𝑖}) \\ = 𝑃_{1} (𝑥) + {\sum︁}_{𝑖 = 1}^{𝑘} 𝛼_{𝑖} 𝑥^{𝑖} {∑︁}_{𝑛 = 𝑘}^{\infty} 𝑎_{𝑛} 𝑥^{𝑛} \\ = 𝑃_{1} (𝑥) + ({\sum︁}_{𝑖 = 1}^{𝑘} 𝛼_{𝑖} 𝑥^{𝑖}) 𝐴 (𝑥) - {∑︁}_{𝑖 = 1}^{𝑘} 𝛼_{𝑖} {\sum︁}_{𝑛 = 0}^{𝑘 - 𝑖 - 1} 𝑎_{𝑛} 𝑥^{𝑛 + 𝑖} \\ = 𝑃_{1} (𝑥) + ({\sum︁}_{𝑖 = 1}^{𝑘} 𝛼_{𝑖} 𝑥^{𝑖}) 𝐴 (𝑥) - {∑︁}_{𝑗 = 1}^{𝑘 - 1} 𝑥^{𝑗} {∑︁}_{𝑖 = 1}^{𝑗} 𝛼_{𝑖} 𝑎_{𝑗 - 𝑖} . \end{array}$
Thus,
$(1 - {\sum︁}_{𝑖 = 1}^{𝑘} 𝛼_{𝑖} 𝑥^{𝑖}) 𝐴 (𝑥) = 𝑎_{0} + {\sum︁}_{𝑗 = 1}^{𝑘 - 1} 𝑥^{𝑗} (𝑎_{𝑗} - {\sum︁}_{𝑖 = 1}^{𝑗} 𝛼_{𝑖} 𝑎_{𝑗 - 𝑖}) .$

WLOG, we can assume $𝑃 (𝑥) = \sum_{𝑖 = 0}^{𝑘 - 1} 𝑏_{𝑖} 𝑥^{𝑖}$ and $𝑄 (𝑥) = 1 - \sum_{𝑖 = 1}^{𝑘} 𝛼_{𝑖} 𝑥^{𝑖}$ . Let $𝐴 (𝑥) = \sum_{𝑛 = 0}^{\infty} 𝑎_{𝑛} 𝑥^{𝑛}$ be a generating function. Suppose $𝐴 (𝑥; 𝑎_{𝑛}) = \frac{𝑃 (𝑥)}{𝑄 (𝑥)}$ . Then (assuming $𝛼_{𝑖} = 0$ for $𝑖 > 𝑘$ )

1 \cdot {\sum︁}_{𝑛 = 0}^{\infty} 𝑎_{𝑛} 𝑥^{𝑛} - {∑︁}_{𝑛 = 0}^{\infty} ({∑︁}_{𝑖 = 1}^{𝑛} 𝛼_{𝑖} 𝑎_{𝑛 - 𝑖}) 𝑥^{𝑛} = {∑︁}_{𝑖 = 0}^{𝑘 - 1} 𝑏_{𝑖} 𝑥^{𝑖} .

For $𝑛 \geq 𝑘$ , equating the coefficients of $𝑥^{𝑛}$ , we have

𝑎_{𝑛} - {\sum︁}_{𝑖 = 1}^{𝑘} 𝛼_{𝑖} 𝑎_{𝑛 - 𝑖} = 0 .

Therefore, the quotient can indeed be represented by a generating function and the sequence is given by the recurrence relation above.

□

Proposition 8.1 (partial fractions).

(a)

𝑃 (𝑥)

is a polynomial of degree less than

𝑘

, then

\frac{𝑃 (𝑥)}{(1 - 𝑎𝑥)^{𝑘}} = {\sum︁}_{𝑖 = 1}^{𝑘} \frac{𝐴_{𝑖}}{(1 - 𝑎𝑥)^{𝑖}},

where $𝐴_{1}, \dots, 𝐴_{𝑘}$ are constants to be determined.

(b)

𝑃 (𝑥)

is a polynomial of degree less than

𝑝 + 𝑞 + 𝑟

, then

\frac{𝑃 (𝑥)}{(1 - 𝑎𝑥)^{𝑝} (1 - 𝑏𝑥)^{𝑞} (1 - 𝑐𝑥)^{𝑟}} = \frac{𝐴_{1} (𝑥)}{(1 - 𝑎𝑥)^{𝑝}} + \frac{𝐴_{2} (𝑥)}{(1 - 𝑏𝑥)^{𝑞}} + \frac{𝐴_{3} (𝑥)}{(1 - 𝑐𝑥)^{𝑟}},

where $𝐴_{1} (𝑥), 𝐴_{2} (𝑥), 𝐴_{3} (𝑥)$ are polynomials of degree $𝑞 + 𝑟$ , $𝑝 + 𝑟$ , $𝑝 + 𝑞$ , respectively.

No proof.

8.4 Catalan Sequence

Let $𝐶_{𝑛}$ denote the number of ways to divide a labeled convex $(𝑛 + 2)$ -polygon into triangles. We consider a partition (i.e., into distinct sets) of all divisions according to the first node connected with node 1 except node 2.

Suppose $𝑘 > 2$ is the first node connected to node 1, then there must be a segment between node 2 and node $𝑘$ . This leaves a $(𝑘 - 1)$ polygon and an $(𝑛 + 4 - 𝑘)$ polygon free to divide.

Figure 6: (n+2)-polygon

The number of divisions under this category is $𝐶_{𝑘 - 3} 𝐶_{𝑛 + 2 - 𝑘}$ .

We thus obtain the recurrence relation

𝐶_{𝑛} = {\sum︁}_{𝑘 = 3}^{𝑛 + 2} 𝐶_{𝑘 - 3} 𝐶_{𝑛 + 2 - 𝑘} = {\sum︁}_{𝑘 = 0}^{𝑛 - 1} 𝐶_{𝑘} 𝐶_{𝑛 - 1 - 𝑘} .

Here, we define $𝐶_{0} = 1$ (the triangle collapse to a segment for a 2-polygon).

Consider the generating function $𝐹 (𝑥) = \sum_{𝑛 = 0}^{\infty} 𝐶_{𝑛} 𝑥^{𝑛}$ .

9 Special Counting Sequences

9.1 ○Catalan Sequence

Definition 9.1. The Catalan¹ sequence is

𝐶_{𝑛} = \frac{1}{𝑛 + 1} (\binom{2 𝑛}{𝑛}), 𝑛 \geq 0 .

The $𝑛$ -th term $𝐶_{𝑛}$ is called the $𝑛$ -th Catalan number.

Theorem 9.1. $𝐶_{𝑛}$ is the number of lattice paths along the edges of an $𝑛 \times 𝑛$ grid ( $𝑛 \times 𝑛$ square cells and $(𝑛 + 1)^{2}$ vertices) from $(0, 0)$ to $(𝑛, 𝑛)$ , which do not pass above the diagonal.

Proof. Any lattice path from $(0, 0)$ to $(𝑛, 𝑛)$ have exactly $𝑛$ right moves and $𝑛$ up moves.

Let call a path “bad” if it passes above the diagonal. This happens iff it touches the next higher diagonal (red). Consider the first contact point of a bad path and the next higher diagonal. The portion of the path before (including) this point (black solid) has one more up move than right moves, and the portion after (including) it (red dotted) has one more right move than up moves. If we flip the latter portion about the next higher diagonal (red solid), the entire path has the same number of total moves, but with two more up moves than right moves. In conclusion, if a path is bad then it has $(𝑛 - 1)$ right moves and $(𝑛 + 1)$ up moves.

Conversely, if a path has $(𝑛 - 1)$ right moves and $(𝑛 + 1)$ up moves then it ends up at $(𝑛 - 1, 𝑛 + 1)$ . To reach this ending point, the path must cross the main diagonal. Therefore, having $(𝑛 - 1)$ right moves and $(𝑛 + 1)$ up moves is the exact condition for the path to be bad.

Thus, $𝐶_{𝑛} = (\binom{2 𝑛}{𝑛}) - (\binom{2 𝑛}{𝑛 - 1}) = \frac{1}{𝑛 + 1} (\binom{2 𝑛}{𝑛})$ .

[TODO: How to think of this.]

[TODO: Interpreation of the factor

\frac{1}{𝑛 + 1}

]

□

Theorem 9.2. $𝐶_{𝑛}$ is the number of words $𝑎_{1} 𝑎_{2} \dots 𝑎_{2 𝑛}$ of length $2 𝑛$ consisting of exactly $𝑛$ positive ones and exactly $𝑛$ negative ones and satisfying

𝑎_{1} + \dots + 𝑎_{𝑖} \geq 0, \forall 𝑖 \leq 2 𝑛 .

Proof. Similar. □

9.1.1 [TODO: A geometric example]

9.2 Difference sequences

Definition 9.2. The first order difference sequence of a sequence $(𝑎_{𝑛}; 𝑛 \geq 0)$ defined as

Δ 𝑎_{𝑛} : = (Δ 𝑎)_{𝑛} = 𝑎_{𝑛 + 1} - 𝑎_{𝑛}, 𝑛 \geq 0 .

The $𝑝$ -th order difference sequence $Δ^{𝑝} 𝑎_{𝑛}$ is the difference sequence of the sequence $Δ^{𝑝 - 1} 𝑎_{𝑛}$ .

Viewing sequences as functions $: Z_{\geq 0} \to C$ , we can represent the difference sequence in the following way.

Definition 9.3. Let $S_{\infty}$ denote the vector space of functions $𝑓 : Z_{\geq 0} \to C$ with the ordinary addition and scalar multiplication. The difference operator $Δ : S_{\infty} \to S_{\infty}$ is defined by

(Δ 𝑓) (𝑛) = 𝑓 (𝑛 + 1) - 𝑓 (𝑛), 𝑛 \geq 0 .

Proposition 9.1. The difference operator $Δ : S_{\infty} \to S_{\infty}$ is a linear map.

Definition 9.4. The difference table of a sequence $𝑓 (𝑛)$ is the array

\begin{matrix} 0 : & 𝑓 (0) & 𝑓 (1) & 𝑓 (2) & 𝑓 (3) & \dots \\ 1 : & (Δ 𝑓) (0) & (Δ 𝑓) (1) & (Δ 𝑓) (2) & ⋱ \\ 2 : & (Δ^{2} 𝑓) (0) & (Δ^{2} 𝑓) (1) & ⋱ \\ 3 : & (Δ^{3} 𝑓) (0) & ⋱ \\ ⋱ \end{matrix}

Proposition 9.2. Let $𝑓 (𝑛), 𝑛 \geq 0$ and $𝑔 (𝑛), 𝑛 \geq 0$ be two sequences. If $(Δ^{𝑝} 𝑓) (0) = (Δ^{𝑝} 𝑔) (0)$ for all $𝑝 \geq 0$ , then $𝑓 (𝑛) \equiv 𝑔 (𝑛)$ .

Proof: trivial.

Theorem 9.3. For every $𝑓 \in S_{\infty}$ , the $𝑝$ -th order difference sequence $Δ^{𝑝} 𝑓$ has the form

(Δ^{𝑝} 𝑓) (𝑛) = {∑︁}_{𝑘 = 0}^{𝑝} (- 1)^{𝑝 - 𝑘} (\binom{𝑝}{𝑘}) 𝑓 (𝑛 + 𝑘), 𝑛 \geq 0 .

Proof. Observe that $(Δ^{𝑝} 𝑓) (𝑛)$ can be represented a linear combination 0-th order terms $𝑓 (𝑛), 𝑓 (𝑛 + 1), \dots, 𝑓 (𝑛 + 𝑝)$ . Specifically, every “computation path” from $𝑓 (𝑛 + 𝑘)$ to $(Δ^{𝑝} 𝑓) (𝑛)$ contribute $\pm 𝑓 (𝑛 + 𝑘)$ to the sum. E.g.,

Figure 8: Two paths from $𝑓 (2)$ to $(Δ^{3} 𝑓) (0)$

The coefficient is determined by the number of “right moves”, which flip the sign. For $𝑓 (𝑛 + 𝑘)$ , it takes exactly $𝑝$ moves, of which $𝑖$ moves are to the left and $(𝑝 - 𝑘)$ moves are to the right. Thus, the final coefficient for $𝑓 (𝑛 + 𝑘)$ is $(\binom{𝑝}{𝑘}) (- 1)^{𝑝 - 𝑘}$ . □

Theorem 9.4. The difference table of a sequence $𝑓 (𝑛)$ can be determined by its $0$ -th diagonal sequence $(Δ^{0} 𝑓) (0), (Δ^{1} 𝑓) (0), (Δ^{2} 𝑓) (0), \dots$ . Specially, the sequence itself can be determined by

𝑓 (𝑛) = {\sum︁}_{𝑘 = 0}^{𝑛} (\binom{𝑛}{𝑘}) (Δ^{𝑘} 𝑓) (0) = {∑︁}_{𝑘 = 0}^{\infty} (\binom{𝑛}{𝑘}) (Δ^{𝑘} 𝑓) (0), 𝑛 \geq 0 .

This is an analogy with Taylor expansion.

Proof. Similar combinatorial explanation:

□

Corollary 9.1. Let $𝑓 (𝑛), 𝑛 \geq 0$ be a sequence. Its partial sum can be written as

{\sum︁}_{𝑘 = 0}^{𝑛} 𝑓 (𝑘) = {∑︁}_{𝑘 = 0}^{𝑛} (\binom{𝑛 + 1}{𝑘 + 1}) (Δ^{𝑘} 𝑓) (0), 𝑛 \geq 0 .

Proof. Use $\sum_{𝑘 = 𝑖}^{𝑛} (\binom{𝑘}{𝑖}) = (\binom{𝑛 + 1}{𝑘 + 1})$ . □

Definition 9.5. A sequence $𝑓 (𝑛), 𝑛 \geq 0$ is called a polynomial sequence of degree $𝑝$ if it has the form

𝑓 (𝑛) = 𝛼_{𝑝} 𝑛^{𝑝} + 𝛼_{𝑝 - 1} 𝑛^{𝑝 - 1} + \dots + 𝛼_{0}, 𝑛 \geq 0,

where $𝛼$ ’s are constants and $𝛼_{𝑝} \neq 0$ .

Lemma 9.1. For every polynomial sequence $𝑓 (𝑛), 𝑛 \geq 0$ of degree $𝑝$ , the $(𝑝 + 1)$ -th order difference sequence $Δ^{𝑝 + 1} 𝑓 \equiv 0$ .

Proof. The 1st order difference sequence of a polynomial sequence of degree $𝑝$ is a polynomial sequence of degree at most $𝑝 - 1$ : This is because $(𝑛 + 1)^{𝑘} - 𝑛^{𝑘}$ is a polynomial of degree $𝑘 - 1$ .

Specially, the 1st order difference sequence of a polynomial sequence of degree $0$ is a zero sequence: $𝛼_{0} [(𝑛 + 1)^{0} - 𝑛^{0}] = 0$ . □

Corollary 9.2. Let $𝑓 (𝑛) = 𝑛^{𝑝}, 𝑛 \geq 0$ , where $𝑝$ is a constant. Then,

{\sum︁}_{𝑘 = 0}^{𝑛} 𝑘^{𝑝} = {∑︁}_{𝑘 = 0}^{𝑝} (\binom{𝑛 + 1}{𝑘 + 1}) (Δ^{𝑘} 𝑓) (0),

which is a polynomial of degree $𝑝 + 1$ .

Proof. Discuss $𝑛 \geq 𝑝$ and $𝑛 < 𝑝$ . □

Example 9.1. For $𝑓 (𝑛) = 𝑛^{2}, 𝑛 \geq 0$ and $𝑓 (𝑛) = 𝑛^{3}, 𝑛 \geq 0$ .

\begin{matrix} 0 & 1 & 4 \\ 1 & 3 \\ 2 \end{matrix}, \begin{matrix} 0 & 1 & 8 & 27 \\ 1 & 7 & 19 \\ 6 & 12 \\ 6 \end{matrix} .

Thus, $\sum_{𝑘 = 0}^{𝑛} 𝑘^{2} = (\binom{𝑛 + 1}{2}) + 2 (\binom{𝑛 + 1}{3}) = \frac{(𝑛 + 1) 𝑛 (2 𝑛 + 1)}{6}$ , $\sum_{𝑘 = 0}^{𝑛} 𝑘^{3} = (\binom{𝑛 + 1}{2}) + 6 (\binom{𝑛 + 1}{3}) + 6 (\binom{𝑛 + 1}{4}) = {(\frac{(𝑛 + 1) 𝑛}{2})}^{2}$ .

9.3 ○Stirling numbers

They are introduced by James Stirling in a purely algebraic setting (as connnection coefficients between monomials and falling polynomials) in 1730 and were rediscovered and given a combinatorial meaning by Masanobu Saka in 1782 [Mansour and Schork, 2015]. In the following, the definitions will be given by combinatorics and their algebraic meaning will be stated as theorems.

9.3.1 of the second kind

Definition 9.6. The Stirling numbers of the second kind, written $𝑆 (𝑛, 𝑘)$ , are defined as the number of ways to partition a set of $𝑛$ labeled objects into $𝑘$ nonempty unlabeled subsets.

We have $𝑆 (𝑝, 𝑘) = 0$ for $𝑘 > 𝑝$ and $𝑆 (𝑛, 0) = 0$ for $𝑛 \geq 1$ and assume $𝑆 (0, 0) = 1$ .

Definition 9.7. The $𝑛$ -th Bell number $𝐵_{𝑛}$ is the number of partitions of an $𝑛$ -set into nonempty unlabeled subsets, that is,

𝐵_{𝑛} = {\sum︁}_{𝑘 = 0}^{𝑛} 𝑆_{𝑛, 𝑘} .

Proposition 9.3. The Stirling numbers of the second kind satisfy the recurrence relation

𝑆 (𝑛, 𝑘) = 𝑆 (𝑛 - 1, 𝑘 - 1) + 𝑘𝑆 (𝑛 - 1, 𝑘), 𝑛 > 𝑘 \geq 1

with initial conditions

𝑆 (0, 0) = 𝑆 (𝑛, 𝑛) = 1 \forall 𝑛 \geq 0, 𝑆 (𝑛, 0) = 0 \forall 𝑛 \geq 1, 𝑆 (𝑛, 1) = 1 \forall 𝑛 \geq 1 .

Proof. The initial conditions are easy to show.

To prove the recurrence, consider how to put the $𝑛$ -th object based on partition of the first $(𝑛 - 1)$ objects:

Either as a singleton
Or join it into one of the previous subsets.

□

Definition 9.8. Let $𝑝$ be a non-negative integer. The falling factorial polynomials in variable $𝑥 \in R$ are defined by (refers to Defn 3.12)

(𝑥)_{𝑝} : = 𝑥^{\underset{̲}{𝑝}} : = 𝑝! (\binom{𝑥}{𝑝}) = {\begin{matrix} 1, & if 𝑝 = 0 \\ 𝑥 (𝑥 - 1) \dots (𝑥 - 𝑝 + 1), & if 𝑝 \geq 1 . \end{matrix}

The rising factorial polynomials are defined by

(𝑥)^{𝑝} : = 𝑥^{\bar{𝑝}} : = 𝑥 (𝑥 + 1) \dots (𝑥 + 𝑝 - 1) .

When $𝑛$ is a non-negative integer, we have
$(𝑛)_{𝑝} = 𝑃 (𝑛, 𝑝) .$

Specially, for $𝑝 > 𝑛$ , $(𝑛)_{𝑝} = 𝑛 (𝑛 - 1) \dots (0) \dots (𝑛 - 𝑝 + 1) = 0$ .
The coefficient of $𝑥^{𝑘}$ in the expansion of $(𝑥)_{𝑝}$ and $(𝑥)^{𝑝}$ only differ in the sign.

Theorem 9.5 (algebraic meaning). Let $𝑥 \in R$ and $𝑝$ be a non-negative integer.

𝑥^{𝑝} = {∑︁}_{𝑘 = 0}^{𝑝} (\binom{𝑥}{𝑘}) 𝑘! 𝑆 (𝑝, 𝑘) = {\sum︁}_{𝑘 = 0}^{𝑝} 𝑆 (𝑝, 𝑘) (𝑥)_{𝑘} .

combinatorial, when $𝑥 = : 𝑛 \in N$ . $𝑘! 𝑆 (𝑝, 𝑘)$ is the number of ways to partition a set of $𝑝$ labeled objects into $𝑘$ nonempty labeled subsets.

Consider partitioning a set of $𝑛$ labeled objects into $𝑝$ labeled subsets of which $𝑘$ are nonempty. Then the number of ways of such partitions is $(\binom{𝑛}{𝑘}) 𝑘! 𝑆 (𝑝, 𝑘)$ .

Then, the number of ways to deliver $𝑝$ labeled objects into $𝑛$ labeled boxes (with empty boxes allowed) can be written as

𝑛^{𝑝} = {∑︁}_{𝑘 = 0}^{𝑝} (\binom{𝑛}{𝑘}) 𝑘! 𝑆 (𝑝, 𝑘) .

□

Theorem 9.6. Let $𝑥 \in R$ and $𝑝$ be a non-negative integer. Then Stirling numbers of the second kind satisfy

𝑆 (𝑝, 𝑘) = \frac{(Δ^{𝑘} 𝑛^{𝑝}) (0)}{𝑘!}, 𝑝 \geq 𝑘 \geq 0 .

Proof. The middle term in Theorem 9.5 takes the same form as Equation (9.4). Since the inverse of the process of taking the difference is the process of taking the sum (refer to the difference table), $(Δ^{𝑘} 𝑛^{𝑝}) (0) = 𝑘! 𝑆 (𝑝, 𝑘)$ . □

9.3.2 of the first kind

Definition 9.9. The (unsigned) Stirling numbers of the first kind, written $𝑐 (𝑝, 𝑘)$ , are defined as the number of permutations of $𝑝$ elements with $𝑘$ disjoint circles (Section 3.7.2).

We have $𝑐 (𝑝, 0) = 0$ for all $𝑝 \geq 1$ and assume $𝑐 (0, 0) = 1$ .

Proposition 9.4. For any $𝑝 > 𝑘 \geq 1$ , the (unsigned) Stirling numbers of the first kind satisfy

𝑐_{𝑝, 𝑘} = 𝑐_{𝑝 - 1, 𝑘 - 1} + (𝑝 - 1) 𝑐_{𝑝 - 1, 𝑘},

with initial conditions $𝑐 (𝑝, 𝑝) = 1$ for all $𝑝$ .

Proof. The permutations of a $𝑝$ -set can be divided into two kinds:

The singleton ${𝑝}$ forms a cycle by itself: $𝑐 (𝑝 - 1, 𝑘 - 1)$ .
$𝑝$ joins with other elements to form a cycle: There are $(𝑝 - 1)$ ways to join $𝑝$ to the cycles formed by the remaining $(𝑝 - 1)$ elements (unlike Proposition 9.3).

□

Theorem 9.7 (algebraic meaning). The signed Stirling numbers of the first kind are the coefficients of $𝑛^{𝑘}$ in the expansion the falling factorial $(𝑛)_{𝑝}$ :

(𝑛)_{𝑝} = {\sum︁}_{𝑘 = 0}^{𝑝} 𝑠 (𝑝, 𝑘) 𝑛^{𝑘} .

It is easy to see from the expansion of $(𝑛)_{𝑝}$ that the coefficient of $𝑛^{𝑘}$ is a product of $(𝑝 - 𝑘)$ negative numbers, so the sign of $𝑠 (𝑝, 𝑘)$ is $(- 1)^{𝑝 - 𝑘}$ . The unsigned Stirling numbers of the first kind satisfy

𝑐 (𝑝, 𝑘) = | 𝑠 (𝑝, 𝑘) | = (- 1)^{𝑝 - 𝑘} 𝑠 (𝑝, 𝑘) .

[TODO: rising polynomials]

Definition 9.10. The elementary symmetric polynomials $𝑠_{0}, 𝑠_{1}, \dots, 𝑠_{𝑝}$ in $𝑝$ variables $𝑥_{1}, \dots, 𝑥_{𝑝}$ are defined as the coefficients in the of $𝑦^{𝑘}$ in the expansion of the polynomial $(𝑦 + 𝑥_{1}) \dots (𝑦 + 𝑥_{𝑝})$ , i.e.,

(𝑦 + 𝑥_{1}) \dots (𝑦 + 𝑥_{𝑝}) = {\sum︁}_{𝑘 = 0}^{𝑝} 𝑠_{𝑝 - 𝑘} (𝑥_{1}, \dots, 𝑥_{𝑘}) 𝑦^{𝑘} .

Proof. [TODO: ] □

Plugging $𝑥_{1} = 0, 𝑥_{2} = - 1, \dots, 𝑥_{𝑝} = - 𝑝 + 1$ in the polynomial, we have

\begin{array}{l} 𝑦 (𝑦 - 1) \dots (𝑦 - 𝑝 + 1) & = (𝑦)_{𝑝} = {\sum︁}_{𝑘 = 0}^{𝑝} 𝑠_{𝑝 - 𝑘} (0, - 1, \dots, - 𝑝 + 1) 𝑦^{𝑘} \\ = {\sum︁}_{𝑘 = 0}^{𝑝} (- 1)^{𝑝 - 𝑘} 𝑠_{𝑝 - 𝑘} (0, 1, \dots, 𝑝 - 1) 𝑦^{𝑘} . \end{array}

Thus, we have $𝑠 (𝑝, 𝑘) = (- 1)^{𝑝 - 𝑘} 𝑠_{𝑝 - 𝑘} (0, 1, \dots, 𝑝 - 1)$ . We define $𝑐 (𝑝, 𝑘) = 𝑠_{𝑝 - 𝑘} (0, 1, \dots, 𝑝 - 1)$ .

Proposition 9.5. Let $𝑝 \geq 𝑗$ be non-negative integers.

{\sum︁}_{𝑘 = 𝑗}^{𝑝} 𝑆 (𝑝, 𝑘) 𝑠 (𝑘, 𝑗) = 𝛿_{𝑝𝑗} = {\sum︁}_{𝑘 = 𝑗}^{𝑝} 𝑠 (𝑝, 𝑘) 𝑆 (𝑘, 𝑗) .

Proof. Let $𝑥 \in R$ be a variable. Fix $𝑝$ . The following holds for any $𝑥$ :

𝑥^{𝑝} = {∑︁}_{𝑘 = 0}^{𝑝} 𝑆 (𝑝, 𝑘) (𝑥)_{𝑘} = {\sum︁}_{𝑘 = 0}^{𝑝} 𝑆 (𝑝, 𝑘) {∑︁}_{𝑗 = 0}^{𝑘} 𝑠 (𝑘, 𝑗) 𝑥^{𝑗} .

Thus, for any $0 \leq 𝑗 \leq 𝑝$ , the coefficient $\sum_{𝑘 = 𝑗}^{𝑝} 𝑆 (𝑝, 𝑘) 𝑠 (𝑘, 𝑗)$ of $𝑥^{𝑗}$ is $𝛿_{𝑝𝑗}$ .

Similar for the other part. □

10 Graph-related Problems

10.1 Matching

Definition 10.1 (Matching). Given a graph $𝐺 = (𝑉, 𝐸)$ , a subset $𝑀 \subseteq 𝐸$ is a matching if no two edges in $𝑀$ share a common endpoint.

A matching is said to be

maximum if there is no matching with more edges
perfect if it covers all nodes

10.2 Vertex Covering

Definition 10.2. A vertex subset $𝑈$ of a graph $𝐺 = (𝑉, 𝐸)$ is a vertex cover of $𝐺$ if every edge of $𝐺$ has an endpoint in $𝑈$ . A vertex cover is said to be minimum if there is no vertex cover with fewer vertices.

Proposition 10.1. Let $𝐺$ be a graph. For any matching $𝑀$ and vertex cover $𝐶$ of $𝐺$ , we have $| 𝑀 | \leq | 𝐶 |$ .

Proof. The vertex cover has to contain at least one endpoint of each edge in the matching. □

Corollary 10.1. Let $𝐺$ be a graph. If a matching $𝑀$ and a vertex cover $𝐶$ of $𝐺$ have the same cardinality, then $𝑀$ is a maximum matching and $𝐶$ is a minimum vertex covering.

Theorem 10.1 (Kőnig’s theorem [Kőnig, 1931]² ). If $𝐺$ is a bipartite graph, then the number of edges in a maximum matching equals the number of vertices in a minimum vertex covering.

Proof. TODO □

10.3 ○Systems of distinct representatives

Definition 10.3. Let $𝐴 = (𝐴_{1}, 𝐴_{2}, \dots, 𝐴_{𝑛})$ be a family of nonempty subsets of a given set $𝑋$ . A system of distinct representatives (SDR) of $𝐴$ is a family $𝑆 = (𝑎_{1}, 𝑎_{2}, \dots, 𝑎_{𝑛})$ of distinct members, where $𝑎_{𝑖} \in 𝐴_{𝑖}$ for all $𝑖$ .

It is easy to see that:

Proposition 10.2 (graph theoretical formulation). Let $𝐴 = (𝐴_{1}, 𝐴_{2}, \dots, 𝐴_{𝑛})$ be a family of nonempty subsets. Construct a bipartite graph $𝐺 = (𝐿 ⊔ 𝑅, 𝐸)$ with bipartite sets $𝐿 = 𝐴$ and $𝑅 = \cup_{𝑖 \in 𝑛} 𝐴_{𝑖}$ and edge set $𝐸 = {(𝐴, 𝑎) : 𝐴 \in 𝐿, 𝑎 \in 𝑅, 𝑎 \in 𝐴}$ . Then, $𝐴$ has an SDR iff $𝐺$ has a matching of size $𝑛$ (i.e., an $𝐿$ -perfect matching).

Definition 10.4. Let $𝐴 = (𝐴_{1}, 𝐴_{2}, \dots, 𝐴_{𝑛})$ be a family of nonempty subsets of a given set $𝑋$ . Then $𝐴$ is said to satisfy the marriage condition (aka Hall’s condition) iff

| ⋃_{𝑖 \in 𝐼} 𝐴_{𝑖} | \geq | 𝐼 |, \forall 𝐼 \subset [𝑛] .

It is obvious that the marriage condition is necessary for $𝐴$ to have a system of distinct representatives. However, it turns out that this condition is also sufficient:

Theorem 10.2 (Hall [1935]). The theorem has two equivalent formulations:

Set theoretical formulation: A family $𝐴 = (𝐴_{1}, 𝐴_{2}, \dots, 𝐴_{𝑛})$ of sets has a system of distinct representatives iff the marriage condition holds.
Graph theoretical formulation: Let $𝐺 = (𝐿 ⊔ 𝑅, 𝐸)$ be a finite bipartite graph with bipartite sets $𝐿$ and $𝑅$ . Then there is an $𝐿$ -perfect matching iff for every subset $𝐴 \subset 𝐿$ we have $| 𝐴 | \leq | 𝑁_{𝐺} (𝐴) |$ .

graph theoretical. We only need to prove that if there does not exist an $𝐿$ -perfect matching then the Hall’s condition does not hold.

If the maximum matching has fewer than $𝑛$ edges, then, by Kőnig’s Theorem 10.1, the minimum vertex covering also has fewer than $𝑛$ nodes. Let $𝑈$ be a minimum vertex covering. Let $𝐿_{1} = 𝐿 \cap 𝑈, 𝑅_{1} = 𝑅 \cap 𝑈, 𝐿_{2} = 𝐿 \ 𝐿_{1}$ . $| 𝑈 | = | 𝐿_{1} | + | 𝑅_{1} | < 𝑛$ .

Edges connected with $𝐿_{2}$ can only be covered by nodes in $𝑅_{1}$ , implying that $𝑁 (𝐿_{2}) \subseteq 𝑅_{1}$ . Since $| 𝑅_{1} | = | 𝑈 | - | 𝐿_{1} | < 𝑛 - | 𝐿_{1} | = | 𝐿 | - | 𝐿_{1} | = | 𝐿_{2} |$ , we have $| 𝐿_{2} | > | 𝑁 (𝐿_{2}) |$ , violating the Hall’s conditions.

□

10.4 Stable Marriage

Definition 10.5. The stable matching problem is the problem of finding a stable matching between two equally sized sets of elements $𝐿$ (men) and $𝑅$ (women), where each element in one set has a total ordering about the other set.

A complete marriage (i.e., matching) is unstable if there exists two women $𝐴, 𝐵 \in 𝐿$ and two men $𝑎, 𝑏 \in 𝑅$ such that

$𝐴$ and $𝑎$ get married. $𝐵$ and $𝑏$ get married.
$𝐴$ prefers $𝑏$ rather than $𝑎$ . $𝑏$ also prefers $𝐴$ rather than $𝐵$ .

In other words, there exist a pair of woman and man who both prefer each other to their current partner. Otherwise, the complete marriage is stable.

Theorem 10.3. A stable marriage always exists.

Algorithm 1 Gale and Shapley [1962]

Begin with all women marked as rejected, and no man rejected any woman.

0: If there is no rejected woman, then stop. Otherwise, go to Step 1.
1: Let each woman who is marked as rejected choose whom she ranks the highest among all men who have not yet rejected her.
Then go to Step 2.
2: Let each man pick whom he ranks the highest among all women who have chosen him and whom he has not yet rejected, defers to her decision, and reject the others.
Then go to Step 0.

Proof. [TODO: ] □

References

D. Gale and L. S. Shapley. College admissions and the stability of marriage. The American Mathematical Monthly, 69(1):9–15, 1962. doi: 10.1080/00029890.1962.11989827. URL https://doi.org/10.1080/00029890.1962.11989827.

P. Hall. On representatives of subsets. Journal of the London Mathematical Society, s1-10(1):26–30, 1935. doi: https://doi.org/10.1112/jlms/s1-10.37.26. URL https://londmathsoc.onlinelibrary.wiley.com/doi/abs/10.1112/jlms/s1-10.37.26.

D. Kőnig. Gráphok és mátrixok. Matematikai és Fizikai Lapok, 38, 1931. URL https://real-j.mtak.hu/7307/.

T. Mansour and M. Schork. Commutation Relations, Normal Ordering, and Stirling Numbers. Chapman and Hall/CRC, 09 2015. doi: https://doi.org/10.1201/b18869.

G. Szárnyas. Graphs and matrices: A translation of "graphok és matrixok" by dénes kőnig (1931), 2020. URL https://arxiv.org/abs/2009.03780.

Combinatorics

Contents

1 Misc

1.1 Perfect cover of chessboard

1.2 Magic square (of odd order)

2 Nim Game

2.1 Restricted version

2.2 Sprouts

3 Counting Principles

3.1 Basic Counting Principle

3.2 Pigeonhole Principle

3.3 Permutation

3.3.1 ○permutation of multisets

3.3.2 generating permutations

3.4 Combination

3.4.1 distinguishable boxes

3.4.2 ○combination with repetition / combinations of multisets

3.4.3 ○generating combinations

3.5 Binomial theorem and multinomial theorem

3.5.1 Generalized

3.5.2 Unimodality

3.6 Common Counting Methods

3.7 Other Properties of Permutations

3.7.1 ○inversions of permutations

3.7.2 ○cycles of permutations

4 Counting Applications

4.1 ○Partially ordered sets

4.1.1 Sperner’s Theorem

4.1.2 ○Dilworth Theorem

4.2 Graph

4.3 Ramsey Theory

4.3.1 definitions

4.3.2 properties

4.3.3 TODO proof of the theorem

5 Inclusion-exclusion principle

5.1 weighted version

5.2 derangements

5.2.1 [TODO: circular non-consecutive permutations]

5.3 permutation with forbidden positions

5.4 [TODO: nonconsecutive permutations]

5.5 Euler totient function

5.6 rook polynomials

6 ○Möbius inversion

6.1 Möbius inversion

6.2 product posets

6.3 ○[TODO: number theoretical]

6.4 ○circular permutations with repetition

7 Linear Recurrence

7.1 ○structural theorems

7.2 constant coefficients

7.2.1 homogeneous

7.2.2 nonhomogeneous

8 ○Generating Functions

8.1 ordinary generating function

8.2 exponential generating function

8.3 connection with linear recurrence

8.4 Catalan Sequence

9 Special Counting Sequences

9.1 ○Catalan Sequence

9.1.1 [TODO: A geometric example]

9.2 Difference sequences

9.3 ○Stirling numbers

9.3.1 of the second kind

9.3.2 of the first kind

10 Graph-related Problems

10.1 Matching

10.2 Vertex Covering

10.3 ○Systems of distinct representatives

10.4 Stable Marriage

References