Combinatorics


December 25, 2024

Contents

1 Misc
1.1 Perfect cover of chessboard
1.2 Magic square (of odd order)
2 Nim Game
2.1 Restricted version
2.2 Sprouts
3 Counting Principles
3.1 Basic Counting Principle
3.2 Pigeonhole Principle
3.3 Permutation
3.3.1 permutation of multisets
3.3.2 generating permutations
3.4 Combination
3.4.1 distinguishable boxes
3.4.2 combination with repetition / combinations of multisets
3.4.3 generating combinations
3.5 Binomial theorem and multinomial theorem
3.5.1 Generalized
3.5.2 Unimodality
3.6 Common Counting Methods
3.7 Other Properties of Permutations
3.7.1 inversions of permutations
3.7.2 cycles of permutations
4 Counting Applications
4.1 Partially ordered sets
4.1.1 Sperner’s Theorem
4.1.2 Dilworth Theorem
4.2 Graph
4.3 Ramsey Theory
4.3.1 definitions
4.3.2 properties
4.3.3 TODO proof of the theorem
5 Inclusion-exclusion principle
5.1 weighted version
5.2 derangements
5.2.1 [TODO: circular non-consecutive permutations]
5.3 permutation with forbidden positions
5.4 [TODO: nonconsecutive permutations]
5.5 Euler totient function
5.6 rook polynomials
6 Möbius inversion
6.1 Möbius inversion
6.2 product posets
6.3 [TODO: number theoretical]
6.4 circular permutations with repetition
7 Linear Recurrence
7.1 structural theorems
7.2 constant coefficients
7.2.1 homogeneous
7.2.2 nonhomogeneous
8 Generating Functions
8.1 ordinary generating function
8.2 exponential generating function
8.3 connection with linear recurrence
8.4 Catalan Sequence
9 Special Counting Sequences
9.1 Catalan Sequence
9.1.1 [TODO: A geometric example]
9.2 Difference sequences
9.3 Stirling numbers
9.3.1 of the second kind
9.3.2 of the first kind
10 Graph-related Problems
10.1 Matching
10.2 Vertex Covering
10.3 Systems of distinct representatives
10.4 Stable Marriage

1 Misc

1.1 Perfect cover of chessboard

Definition 1.1 (b-omino). A 𝑏-omino is a (1,𝑏)-board or a (𝑏,1)-board, where 𝑏 2.

Definition 1.2 (Perfect cover). A perfect cover of an 𝑚-by-𝑛 chessboard by 𝑏-ominoes is an arrangement of 𝑏-ominoes on the chessboard so that no two 𝑏-ominoes overlap.

Theorem 1.1. An 𝑚-by-𝑛 chessboard has a perfect cover by 𝑏-ominoes if and only if 𝑏 divides either 𝑚 or 𝑛.

Proof. side is obvious: Place all dominoes vertically or horizontally.

side: Consider the cyclic coloring (generalization of the alternate black-white coloring):

+1 1 [ 1 2 𝑏 1 𝑏 1 𝑏 1 𝑏 𝑏 1 𝑏 𝑏 2 𝑏 1 2 3 1 2 1 2 𝑏 1 ] .

It follows that, no matter how the 𝑏-ominos are placed, each of them will over exactly one square of each of the 𝑏 colors.

We color the 𝑚×𝑛-board in this way. Suppose 𝑚 = 𝑝𝑏 +𝑟,0 𝑟 < 𝑏 and 𝑛 = 𝑞𝑏 +𝑠,0 𝑠 < 𝑏. Then, we divide the chessboard as

[ 𝑝𝑏 × 𝑛 𝑟 ×𝑞𝑏 | 𝑟 ×𝑠 ].

Apparently, the 𝑝𝑏 ×𝑛 and 𝑟 ×𝑞𝑏 parts have the same number of squares for each color. If the whole board has a perfect covering, then the remaining 𝑟 ×𝑠 part must also have the same number of squares for each color. WLOG, we can assume that 𝑟 < 𝑠 (if not, switch the two dimensions of the board). Then, the top left corner of the 𝑟 ×𝑠 part must be 1. By the nature of the cyclic coloring, 1 occurs 𝑟 times (look at the diagonal) in this part, thus, so do other colors. That is, 𝑟𝑏 = 𝑟𝑠. We can infer from this that 𝑟 = 0. □

1.2 Magic square (of odd order)

Definition 1.3 (magic square). A magic square of order 𝑛 is an 𝑛×𝑛-array constructed out of the integers 1,2,,𝑛2 in such a way that the sum of each row, each column, and each diagonal is the same. The sum is called the magic sum of the magic square.

Method 1.1 (De La Loubère’s method). This is only for magic square of odd order. We consider a double infinite chessboard with the (𝑖,𝑗)-th entry denoted by 𝑎𝑖,𝑗, where 𝑎𝑖+𝑛,𝑗 = 𝑎𝑖,𝑗+𝑛 = 𝑎𝑖,𝑗.

Steps:

1.
𝑎1,(𝑛+1)/2 = 1.
2.
𝑎𝑖1,𝑗+1 := 𝑎𝑖,𝑗 +1 when 𝑛 𝑎𝑖,𝑗
3.
𝑎𝑖+1,𝑗 := 𝑎𝑖,𝑗 +1 when 𝑛|𝑎𝑖,𝑗 and 𝑎𝑖,𝑗 < 𝑛2.

2 Nim Game

Definition 2.1 (Nim Game). Suppose there are 2 players and 𝑘 heap containing 𝑛1,,𝑛𝑘 coins, respectively.

Rules:

1.
The two players alternate turns. Player I plays first.
2.
Each player, when it is their turn, selects one of the heaps and removes one or more coins from it.

The game ends when all the heaps are empty. The player who takes the last coin wins.

Definition 2.2. Writing the sizes of heaps as binary numbers. A nim game is called balanced if the sums of bits are even on all positions.

Proposition 2.1 (From balanced to unbalanced). The removal of any amount of coins from any heap breaks the balance.

Proof. In binary representation, this is just subtracting multiple 1s, which turns some but necessarily all even positions to odd. This is because some position with sum 0 will borrow a 1 from the next position. A concrete example: two piles 110 and 110. Subtract 011 from the first pile, which becomes 0 1 1. The sums become 1 2 1 from 2 2 0. □

Proposition 2.2 (From Unbalanced to Balanced). A player can always turn an unbalanced state to a balanced one.

Proof. Suppose we have 𝑘 heaps and at least one of them is non-empty.

1.
Find the odd bits. We need to choose a heap and flip its bits on these positions.
2.
Notice that there might be some odd position where the bit of this heap is 0. Flipping 0 to 1 is not trivially doable (will increase the size of bit).
3.
For every of such positions, we need to “borrow” some 1-bit on higher (or equal) positions, i.e., flipping that bit to 0 so that we have the budget to flip this bit to 1.
4.
Only need one higher (or equal) 1-bit.
5.
Can just choose the heap with 1 on the highest odd position.

2.1 Restricted version

Suppose we add the restriction that each player can move only 𝑝 {1,2,,𝑞} coins each time.

2.2 Sprouts

3 Counting Principles

3.1 Basic Counting Principle

Definition 3.1 (Product Rule).

e.g. There are three cities A, B and C. Suppose you start from A, pass through B and end at C. There are 3 ways from A to B, 2 ways from B to C. Then the number of total ways from A to B to C is 3 ×2 = 6.

Definition 3.2 (Sum Rule).

e.g. There are 3 method to transport from A to B, by train, by car and by plane. These 3 method have 1,2,3 choices, respectively. Then there are total 1 +2 +3 = 6 ways.

3.2 Pigeonhole Principle

Theorem 3.1 (Pigeonhole Principle). Different descriptions:

Proof: by contradiction

Example 3.1. 25 points are marked inside the area bounded by a rectangle that is 6 cm long and 4 cm wide. Show that there are at least 2 points that are at most 2 cm apart.

Proof: Divide the area into 24 squares of 1 ×1 cm. By the pigeonhole Principle, at least 2 points will be in the same square, The biggest distance between a square is 2 cm.

Theorem 3.2 (Pigeonhole Principle, Strong form). Different descriptions:

Proof: by contradiction.

Remark 1. Why I prefer the second description? The first description implies that (1) the objects are identical, (2) boxes have no size limits, and (3) the objects are distributed into boxes uniformly at random. While this scenario is valid, many applications of the Pigeonhole principle involve more complex conditions and relations between objects and boxes. Relying solely on the first description may make it unobvious or difficult to apply the theorem.

See

It is common to these examples that they cannot be simply modeled by randomly putting identical objects into boxes.

Corollary 3.1. When 𝑞1 = = 𝑞𝑘 = 𝑞, we have the following: If 𝑘(𝑞 1)+1 objects are placed into 𝑘 boxes, then there is at least one box containing 𝑞 or more objects.

Example 3.2. If we draw 𝑛+1 distinct integers from {1,,2𝑛}, then there are always two which differ by 1.

Let 𝑥2𝑖1,2𝑖 be a non-negative integer variable representing the number of integers drawn from 2𝑖 1 and 2𝑖. We have 𝑥2𝑖1,2𝑖 2, but this upper bound does not affect us applying PP to derive 𝑖,𝑥2𝑖1,2𝑖 2.

Example 3.3. In a room there are 10 people. Their ages are given in whole numbers only and are between 1 and 60 (both included). Then, one can always find two disjoint non-empty groups of people (no need to cover all people) the sum of whose ages is the same. Can 10 be replaced by a smaller number.

(An observation irrelevant to PP) It suffices to find two different (but possibly overlapping) subsets of people with the same sum of ages. This is because we can take the intersection out of the two subsets to form two disjoint groups satisfying the condition (think of why).

For 10 people, there are 210 = 1024 subsets. The possible sum of ages ranges from 1 to 600. Let 𝑥𝑖 be a non-negative integer variable representing the number of subsets with the sum equal to 𝑖. Then, 𝑥1 + +𝑥600 = 1024𝑖,𝑥𝑖 2. Note that we allow some redundancy here. E.g., the empty subset and the entire subset.

The result also holds for 9 people, though not that trivially provable because 29 60 ·9. We can prune out the easy case where there exists two people with the same age. Then, the sum of age can be at most 60 +59 + +52 = 504.

Actually, the results still hold for 8 people but not 7 people. It’s no longer easy to use PP for the proof, which is reasonable because we only apply PP to partial information.

Example 3.4. Consider an undirected graph with 100 nodes. Each node has an even degree. Then, there must be three nodes with the same degree.

Let 𝑥𝑖 be a non-negative integer variable representing the number of nodes with degree 𝑖, for 𝑖 = 0,2,,98. Then, a necessary but not sufficient condition for the problem is that 𝑥0 +𝑥2 + +𝑥98 = 100. With this we can only infer that 𝑖,𝑥𝑖 2. To approach the intended result, we have to prune the case a bit:

3.3 Permutation

Definition 3.3 (Permutation). Let 𝑛,𝑘 be two integers with 0 𝑘 𝑛. A 𝑘-permutation of a set of 𝑛 distinct elements is an ordered selection of 𝑟 elements arrangement from the set. The number of such permutations is denoted by 𝑃(𝑛,𝑘)

Theorem 3.3.

𝑃(𝑛,𝑘)= 𝑖=𝑛𝑘+1𝑛𝑖 = 𝑛! (𝑛𝑘)!

Definition 3.4 (Circular permutation). A circular 𝑘-permutation of a set is an ordered selection of 𝑟 elements from 𝑆 arranged as a circle.

Proposition 3.1. The number of circular 𝑘-permutations of an 𝑛-set equals 𝑃(𝑛,𝑘) 𝑘 .

3.3.1 permutation of multisets

Generalized to multisets.

Definition 3.5. Let 𝑀 be a multiset of type (𝑛1,,𝑛𝑘) (𝑘 distinct objects; 𝑛𝑖 is the count of the 𝑖-th one). An 𝑟-permutation of the multiset 𝑀 is an ordered arrangement of 𝑟 objects from 𝑀, where the order among identical objects is disregarded.

Theorem 3.4. The number of 𝑛-permutation of a multiset of type (𝑛1,,𝑛𝑘), where 𝑛1 + +𝑛𝑘 = 𝑛, is

𝑛! 𝑛1!𝑛𝑘!.

Proof. Direct proof: Divide the number of permutations within each type of objects.

Alternatively, we can view this problem as distributing 𝑛 distinct positions to each type of objects, which becomes this combination problem 3.7. The number of ways is ( 𝑛 𝑛1,,𝑛𝑘) (See below). □

3.3.2 generating permutations

Method 3.1. We assign a direction to each integer 𝑘: 𝑘 or 𝑘 . A permutation where each integer is given a direction is called a directed permutation. An integer 𝑘 in a directed permutation is called mobile if its arrow points to a smaller integer adjacent to it.

Now we present an algorithm to generate permutations of {1,,𝑛} using directed permutations:

1.
Start with 1 2 𝑛 .
2.
Find the largest mobile integer 𝑚.
3.
Swap 𝑚 and the adjacent integer pointed to by the direction of 𝑚.
4.
Switch the directions for all integers 𝑝 > 𝑚.
5.
Write down the resulting permutation and return to step 2.
6.
Stop if there is no mobile integer.

E.g., 1 2 3 , 1 3 2 , 3 1 2 , 3 2 1 , 2 3 1 , 2 1 3 .

Proof: TODO

3.4 Combination

Definition 3.6 (Combination). A 𝑘-combination of a set of 𝑛 distinct elements is unordered selection of 𝑘 𝑛 elements from the set.

The number of 𝑘-combination of a set of 𝑛 distinct elements is denoted by (𝑛 𝑘)(\binom{n}{k}).

Theorem 3.5.

( 𝑛 𝑘) = 𝑛! (𝑛𝑘)!𝑘! = 𝑛(𝑛1)(𝑛𝑘 +1) 𝑘!

Proof. 𝑘-permutation can be obtained by ordering a 𝑘-combination. Consequently,

𝑃(𝑛,𝑘)=( 𝑛 𝑘)𝑃(𝑘,𝑘).

From 𝑃(𝑛,𝑘) and 𝑃(𝑘,𝑘), we can derive (𝑛 𝑘). □

Corollary 3.2.

( 𝑛 𝑘) =( 𝑛 𝑛𝑘)

Proof: Combinatorially by bijection principle.

Bijection Principle:

_______________________________________________________________________________________

Proposition 3.2. (2𝑛 𝑛) = 𝑘=0𝑛(𝑛 𝑘)( 𝑛 𝑛𝑘) = 𝑘=0𝑛(𝑛 𝑘)2.

_______________________________________________________________________________________

3.4.1 distinguishable boxes

This is another way to solve “Permutation with multisets”.

Definition 3.7. An ordered collection of 𝑘 disjoint subsets 𝐴1,,𝐴𝑘 of an 𝑛-set (a partition with order) is called a combination of 𝑛 objects of type (𝑛1,,𝑛𝑘) if 𝑛1 = |𝐴1|,,𝑛𝑘 = |𝐴𝑘| and 𝑛 = 𝑛1 + +𝑛𝑘. The number of such combinations is denoted by ( 𝑛 𝑛1,,𝑛𝑘).

In other words, this is the number of ways to distribute 𝑛 distinguishable objects to 𝑘 distinguishable boxes so that the sizes are (𝑛1,,𝑛𝑘).

Theorem 3.6.

( 𝑛 𝑛1,,𝑛𝑘) = 𝑛! 𝑛1!𝑛𝑘!.

Proof. Similar. Think of the segment ( 𝑗=1𝑖1𝑛𝑗,𝑗=1𝑖𝑛𝑗] as the objects distributed to the 𝑖-th box. The order within this segment (box) does not matter, so we divide the number of interior permutations. We can map such a partition uniquely to a combination of type 𝑛1,,𝑛𝑘), and vice versa.

Alternatively, it is equal to

( 𝑛 𝑛1)( 𝑛𝑛1 𝑛2) (𝑛𝑛1 𝑛2 𝑛𝑘 𝑛𝑘) = 𝑛! 𝑛1! · ·𝑛𝑘!.

3.4.2 combination with repetition / combinations of multisets

Definition 3.8. An 𝑟-combination with repetition of a set 𝑆 of 𝑛 distinguishable objects is a selection of 𝑟 elements where each object can be selected unlimited times. The number of 𝑟-combinations with repetition is denoted by 𝑛 𝑟 .

PIC
Figure 1: Selecting balls

Let 𝑀 = {∞𝑥|𝑥 𝑆} be a multiset comprising infinitely many copies of objects in 𝑆. Then the combination can also be viewed as an 𝑟-combination of the multiset 𝑀 (without repetition, in usual sense).

E.g., objects: {1,2,3}, a valid selection of 4 elements: {1,1,3,3}.

Proposition 3.3. 𝑛 𝑟 is the number of ways to distribute 𝑟 indistinguishable balls to 𝑛 distinguishable boxes.

PIC
Figure 2: Distributing balls

Interpretation: Every time choose one box to put in an object, with repetition allowed. Repeat 𝑟 times.

Theorem 3.7.

𝑛 𝑟 =( 𝑟 +𝑛1 𝑟) =( 𝑟 +𝑛1 𝑛1) .

Proof. Stars and Bars: This problem can be represented as a sequence of 𝑛1 bars and 𝑟 stars. The 𝑛1 bars (together with head and tail of the sequence) mark 𝑛 cells, which correspond to 𝑛 objects respectively. The 𝑟 stars correspond to 𝑟 selections, and the number of stars in 𝑖-th cell represents the times the 𝑖-th object was selected. Note that there is a bijection from above sequences to 𝑟-combinations, then the answer is the number of ways to choose 𝑛1 OR 𝑟 positions out of 𝑛1 +𝑟. □

Corollary 3.3. 𝑛 𝑟 is the number of solutions for the following equation of non-negative variables: 𝑥1 + +𝑥𝑛 = 𝑟.

Interpretation: Every time choose one variable (initially 0) to add 1, with repetition allowed. Repeat 𝑟 times.

Example 3.5.

Exercise

Find the number of ways to select 𝑚 distinct numbers from [𝑛] so that no two numbers differ by 1.

Solution 1

Use the difference between the selected elements to rewrite the condition. Let 𝑎1,,𝑎𝑚 be the selected elements sorted in ascending order. Let Δ𝑖 = 𝑎𝑖+1 𝑎𝑖 for 𝑖 = 1,,𝑚1, Δ0 = 𝑎1 0, and Δ𝑚 = 𝑛+1 𝑎𝑚. Then the conditions are equivalent to:

∑︁ 𝑖=0𝑚Δ 𝑖 = 𝑛+1 0,Δ𝑖 2𝑖 [𝑚1],Δ0,Δ𝑚 1.

The number of the solutions can be derived using the corollary above, 𝑛+12𝑚 𝑚 .

Solution 2

We add a number 𝑛+1 and combine 𝑖 +1 with 𝑖 for 𝑖 [𝑛]. Then, the condition is equivalent to choosing 𝑚 (combined) items such that no two overlap. There are (𝑛+1 2𝑚) empty spaces left, so the number of ways equals to the number of arrangements of 𝑚 (identical) dominos and (𝑛+1 2𝑚) (identical) empty spaces.

Example 3.6.

Exercise

Find the number of solutions to the system: 𝑥1 + +𝑥𝑛 𝑚,𝑥𝑖 0𝑖 [𝑛].

Solution

Let 𝑥𝑛+1 = 𝑚(𝑥1 + +𝑥𝑚). Then, the condition is equivalent to 𝑥𝑛+1 0. Thus, the system becomes 𝑥1 + +𝑥𝑛+1 = 𝑚,𝑥𝑖 0𝑖 [𝑛+1]. Note that the solutions to the two systems have one-to-one correspondence. The answer is 𝑛+1 𝑚 .

3.4.3 generating combinations

That is, to generate the power set of a set.

Definition 3.9 (unit cube). The unit 𝑛-cube 𝑄𝑛 is a graph whose vertex set is the set of 0-1 sequences of length 𝑛, where two sequences are adjacent iff they differ in only one place.

Definition 3.10 (Gray code). A Gray code of order 𝑛 is a path of 𝑄𝑛 that visits every vertex of 𝑄𝑛 exactly once, i.e., a Hamilton path of 𝑄𝑛.

A Gray code is called cyclic if it is a closed path (the first vertex is visited twice), i.e., a Hamilton cycle of 𝑄𝑛.

We want to generate the power set as a (special) Gray code.

Definition 3.11 (reflected Gray code). The Gray codes obtained in the following way are called reflected Gray codes. It’s based on induction:

It is easy to see that these are valid Gray codes.

Derivation 3.1. Note that, starting with 00, the sums of bits of the sequences in the reflected Gray code alternate in parity.

Based on the inductive construction above, we observe that:

Method 3.2. The following algorithm generates the reflected Gray code of order 𝑛 in order:

1.
Begin with 𝑎𝑛𝑎𝑛1𝑎1 = 000.
2.
If 𝑎𝑛𝑎𝑛1𝑎1 = 100, stop.
3.
If 𝑖=1𝑛𝑎𝑖 is even, then flip 𝑎1.
4.
If 𝑖=1𝑛𝑎𝑖 is odd, then find 𝑗 = argmin 𝑖{𝑎𝑖 = 1} and flip 𝑎𝑗+1.
5.
Write down 𝑎 and return to Step 2.

3.5 Binomial theorem and multinomial theorem

Theorem 3.8 (Binomial theorem and multinomial theorem).

(𝑥 +𝑦)𝑛 = ∑︁ 𝑖=0𝑛(𝑛 𝑖) 𝑥𝑖𝑦𝑛𝑖,

(𝑥1++𝑥𝑘)𝑛 = ∑︁ 𝑖1++𝑖𝑘=𝑛 𝑖1,,𝑖𝑘0 ( 𝑛 𝑖1,,𝑖𝑘)𝑥1𝑖1 𝑥𝑘𝑖𝑘 .

Example 3.7. Let 𝑛 be a positive integer. Then

∑︁ 𝑘=0𝑛(1)𝑘(𝑛 𝑘)2 = {0, if 𝑛 is odd (1)𝑛/2 ( 𝑛 𝑛/2) ,if 𝑛 is even

3.5.1 Generalized

Definition 3.12 (Binomial coefficients). For any real number 𝛼 and integer 𝑘, define the binomial coefficients

( 𝛼 𝑘) = {0 if 𝑘 < 0 1 if  𝑘 = 0 𝛼(𝛼 1)(𝛼 𝑘 +1)/𝑘!if 𝑘 > 0 .

Proposition 3.4 (Reciprocity law of binomial coefficients).

( 𝛼 𝑘) = (1)𝑘(𝛼 +𝑘 1 𝑘) .

Corollary 3.4. For any positive integer 𝑛,

( 𝑛 𝑘) = (1)𝑘𝑛 𝑘.

Theorem 3.9 (Newton’s binomial expansion). Let 𝛼 be a real number. If |𝑧|< 1, then

(1 +𝑧)𝛼 = ∑︁ 𝑘=0(𝛼 𝑘)𝑧𝑘.

Consequently, we have, for |𝑥|< |𝑦|:

(𝑥 +𝑦)𝛼 = ∑︁ 𝑘=0(𝛼 𝑘)𝑥𝑘𝑦𝛼𝑘.

Proof. Apply the Taylor expansion for (1 +𝑧)𝛼 around 𝑧 = 0:

(1 +𝑧)𝛼 = ∑︁ 𝑘=0(𝑧 0)𝑘𝑓𝑘(0) = ∑︁ 𝑘=0𝑧𝑘𝛼(𝛼 1)(𝛼 𝑘 +1)/𝑘!.

TODO: It can be shown that the R.H.S. has a radius of convergence 1. Thus, the formula holds for |𝑧|< 1.

Let 𝑧 := 𝑥/𝑦. Then we can derive the second formula. □

3.5.2 Unimodality

Definition 3.13 (Unimodality). A sequence 𝑠0,𝑠1,,𝑠𝑛 is said to be unimodal if there is an integer 𝑘 (0 𝑘 𝑛) such that 𝑠0 𝑠1 𝑠𝑘 𝑠𝑘+1 𝑠𝑛.

E.g., the sequence of binomial coefficients (integer) is unimodal.

Definition 3.14 (log-concave). A sequence 𝑠0,𝑠1,,𝑠𝑛 of positive numbers is said to be log-concave if 𝑠𝑖2 𝑠𝑖1𝑠𝑖+1 for 𝑖 = 1,,𝑛1. This condition is equivalent to that the sequence log 𝑠0,log 𝑠1, ,log 𝑠𝑛 is concave, i.e., log 𝑠𝑖 (log 𝑠𝑖1 +log 𝑠𝑖+1)/2.

Proposition 3.5. Each log-concave (positive) sequence 𝑠0,𝑠1,,𝑠𝑛 is unimodal.

Proof. 𝑠𝑖2 𝑠𝑖1𝑠𝑖+1 is equivalent to 𝑠𝑖 𝑠𝑖1 𝑠𝑖+1 𝑠𝑖 . Discuss if 1 is between two adjacent ratios. □

3.6 Common Counting Methods

Four methods:

1.
Combinatorial reasoning: Show that the LHS and the RHS are different ways in counting the same thing.
2.
Binomial expansion: Equation (3.8).
3.
Direct computation
4.
Induction (on 𝑛 and 𝑘 etc.)

3.7 Other Properties of Permutations

3.7.1 inversions of permutations

Definition 3.15 (inversion, inversion sequence, total disorder, sign). Let 𝜋 be a permutation of [𝑛].

Theorem 3.10. For every (valid) inversion sequence, we can recover the original permutation using Method 3.3 or Method 3.4.

Method 3.3. Let (𝑎1,,𝑎𝑛) be a valid inversion sequence.

1.
Mark down 𝑛 empty spaces □□…□.
2.
In the order 𝑖 = 1,2,,𝑛, put 𝑖 into the (𝑎𝑖 +1)-th empty space from the left.

Method 3.4. Let (𝑎1,,𝑎𝑛) be a valid inversion sequence.

1.
Write down 𝑛.
2.
In the order 𝑖 = 𝑛1,𝑛2,,1, place 𝑖 after the 𝑎𝑖-th existing item from the left.

Proof: easy.

Example 3.8.

Exercise

Define a “move” of a permutation as selecting an element and inserting it somewhere in the permutation. Find an algorithm to compute the minimal number of moves to restore an arbitrary input permutation of [𝑛] back to the natural order.

Solution

Observe that if we move two disjoint subsets 𝑆 and 𝑉 of [𝑛], then the relative order among 𝑆 is the same as if we only perform the moves for 𝑆, and so as 𝑉.

Let 𝑓(𝑎1,,𝑎𝑛) be the minimal number of moves for 𝑎1,,𝑎𝑛. It is natural to consider first how to move 𝑛 to the right place (similar for 1). This can be done by moving 𝑛 directly or moving others to indirectly restore 𝑛 (or some combination). Suppose 𝑎𝑘 = 𝑛.

  • By the observation, if we ever move 𝑛 directly, we would better move it to the last position in one go, because no matter where we move 𝑛 to the relative order of the rest does no change. This option costs 𝑓(𝑎1,,𝑎𝑘1,𝑎𝑘+1,,𝑎𝑛1)+1 moves.
  • Thus, the other option is to move others only. We have to move 𝑎𝑘+1,,𝑎𝑛 to the left of 𝑎𝑘 because they are all smaller than 𝑎𝑘 = 𝑛. Again, by the observation, no matter how we move them, the relative order among 𝑎1,,𝑎𝑘1 does not change after the moves. We have to at least:

    1.
    move 𝑎𝑘+1,,𝑎𝑛 and restore the order among them, which costs 𝑛𝑘 moves,
    2.
    and restore the order among 𝑎1,,𝑎𝑘1, which costs 𝑓(𝑎1,,𝑎𝑘1) moves.

    It turns out that these are enough because we can do the 2nd part first followed by the 1st part, in which we restore the order across the two subsets by the way.

Thus, 𝑓(𝑎1,,𝑎𝑛)= min (𝑓(𝑎1, ,𝑎𝑘1,𝑎𝑘+1, ,𝑎𝑛1)+1,𝑓(𝑎1, ,𝑎𝑘1)+𝑛𝑘.

3.7.2 cycles of permutations

Definition 3.16 (Cycle). Let 𝑥1,,𝑥𝑘 be 𝑘 distinct elements in [𝑛]. We denote by (𝑥1𝑥2𝑥𝑘) the cycle 𝑥1 𝑥2 𝑥𝑘 𝑥1. Additionally, we view this as a function [𝑛][𝑛] mapping elements on the cycle to the next element and mapping other elements to themselves:

(𝑥1𝑥2𝑥𝑘)(𝑥)= {𝑥𝑖𝑚𝑜𝑑𝑘+1,if 𝑖,𝑥 = 𝑥𝑖 𝑥, otherwise .

Remark: A cycle (𝑥1𝑥2𝑥𝑘) is invariant to “rotation” or “cyclical permutation”. E.g., (𝑥𝑘𝑥1𝑥𝑘1) still represents the same cycle.

_______________________________________________________________________________________

Definition 3.17 (Transposition). A transposition is a cycle of length 2. An adjacent transposition is a transposition with adjacent elements, of the form (𝑖𝑗), where 𝑗 = 𝑖 ±1.

Derivation 3.2. Let (𝑖𝑗) be a non-adjacent transposition. Since (𝑖𝑗)= (𝑗𝑖), WLOG, we can assume 𝑗 𝑖 2. We try to insert some intermediate element 𝑘 (𝑖,𝑗) in between. If we break (𝑖𝑗) into (𝑗𝑘)(𝑘𝑖), then 𝑖 can be successfully swapped to position 𝑗, but this introduces the position change of 𝑘 and puts 𝑗 wrongly on position 𝑖. Then, we need to add another (𝑘𝑖) on the left to correct this. Thus, (𝑖𝑗)= (𝑘𝑖)(𝑗𝑘)(𝑘𝑖).

Generally, one can first use (𝑗,𝑗 1)(𝑗 1,𝑗 2)(𝑖 +1,𝑖) to swap 𝑖 to the target position 𝑗, with the effect of moving 𝑖 +1,,𝑗 to one position smaller.

PIC
Figure 3: Stage 1

Then, one can use (𝑖 +1,𝑖)(𝑖 +2,𝑖 +1)(𝑗 1,𝑗 2) to move 𝑖 +1,,𝑗 1 back, by the way swapping 𝑗 to 𝑖.

PIC
Figure 4: Stage 2

In summary,

(𝑖𝑗)= (𝑖 +1,𝑖)(𝑖 +2,𝑖 +1)(𝑗 1,𝑗 2)(𝑗,𝑗 1)(𝑗 1,𝑗 2)(𝑖 +1,𝑖)

Theorem 3.11. Any transposition can be represented as a composition of adjacent transpositions.

_______________________________________________________________________________________

Definition 3.18 (Standard Cycle Notation). Let 𝜋 be a permutation of [𝑛]. We derive the standard cycle notation as follows:

1.
Viewing 𝑥𝜋(𝑥) as directed edges, find all the cycles.
2.
Standardize each cycle by making the largest element in the first place. E.g., (238) becomes (823).
3.
Compose all standardized cycles (functions) in ascending order of the leading elements.

The resulting function is called the standard cycle notation of that permutation, denoted by cyc(𝜋). Obviously, we have 𝜋(𝑖)= cyc(𝜋)(𝑖).

E.g., 43865712(5)(7146)(823).

PIC

Figure 5: standard cycle notation example

Proposition 3.6. We can remove the parentheses in a standard cycle notation and still recover the permutation.

Proof. In each cycle, the leading element should be strictly greater than the following elements. On the other hand, the leading elements are increasing. So we can identify the end of a cycle by finding the first element greater than the leading element. □

Definition 3.19. Let cyc^(𝜋), where 𝜋 𝔖𝑛, denote the permutation of [𝑛] obtained from cyc(𝜋) by deleting all the parentheses.

Derivation 3.3. Let rec denote the recovering function in the proposition above. rec is by definition an inverse function of cyc^ (for any permutation 𝜋, rec(cyc^(𝜋))= 𝜋). Thus, the latter one is a bijection.

Proposition 3.7. cyc^ : 𝔖𝑛 𝔖𝑛 is bijection.

4 Counting Applications

4.1 Partially ordered sets

4.1.1 Sperner’s Theorem

Consider chains and anti-chains of subsets of 𝑆.

Theorem 4.1 (Sperner’s Theorem). Every anti-chain of subsets of an 𝑛-element set 𝑆 has size at most ( 𝑛 𝑛/2) and it is achievable.

Proof. Let 𝐴 be an arbitrary anti-chain of subsets of 𝑆 and 𝐶 an arbitrary chain. It is easy to easy that |𝐴𝐶| is either 0 or 1 from the properties the intersection has to satisfy.

Now consider an arbitrary partition 𝐶 of the power set 𝑃(𝑆) into disjoint chains 𝐶1,,𝐶𝑚. Then,

𝐴 = 𝐴𝑃(𝑆)= 𝐴 𝑖=1𝑚𝐶 𝑖 = 𝑖=1𝑚𝐴𝐶 𝑖.

It follows that |𝐴||𝐶|= 𝑚, which is the number of disjoint chains.

To find the 𝐴 with the largest size, it suffices to construct an 𝐴 and a partition such that their sizes are equal (it turned out to be feasible). The construction is done by induction on 𝑛:

On the other hand, if we pick the subset with size 𝑛/2 in each chain (can be shown that it exists and is unique), they form an antichain (can be shown using the size). The length of this antichain is equal to the size of the chain partition, which is equal to ( 𝑛 𝑛/2) because all subsets of size 𝑛/2 are included here. □

4.1.2 Dilworth Theorem

Lemma 4.1. Let (𝑋,) be a poset. For any chain 𝐶 and anti-chain 𝐴 of 𝑋,

|𝐴𝐶|{0,1}.

Lemma 4.2. Let (𝑋,) be a finite poset (as we are counting the elements). For any chain 𝐶 of 𝑋 and anti-chain partition 𝐴 of 𝑋,

|𝐶||𝐴|.

Likewise, for any anti-chain 𝐴 of 𝑋 and chain-partition 𝐶 of 𝑋,

|𝐴||𝐶|.

Proof. |𝐶|= |𝐶 𝑋|= |𝐶 (𝑈𝐴𝐴𝐴)|= |𝐴𝐴𝐶 𝐴|= 𝐴𝐶|𝐶 𝐴| 𝐴𝐴 = |𝐴|. Similar for the other one. □

The following theorems state that the equality can be achieved.

Theorem 4.2. Let (𝑋,) be a finite poset. Then,

max {|𝐶|: 𝐶 chains}= min {|𝐴|: 𝐴 anti-chain partitions}.

Proof. [TODO: ]

Theorem 4.3 (Dilworth theorem). Let (𝑋,) be a finite poset. Then,

max {|𝐴|: 𝐴 chains}= min {|𝐶|: 𝐶 chain partitions}.

Proof. [TODO: ]

4.2 Graph

Definition 4.1 (coloring). A coloring of a graph with 𝑡 colors is a function : 𝑉 [𝑡].

Definition 4.2. A coloring is called proper if no two adjacent nodes receive the same color. The number of proper colorings is a polynomial function of 𝑡, called the chromatic polynomial of 𝐺, denoted by 𝜒(𝐺,𝑡). Thus, it can be written in the form

𝜒(𝐺,𝑡)= ∑︁ 𝑘=0𝑛(1)𝑛𝑘𝑎 𝑘𝑡𝑘.

4.3 Ramsey Theory

4.3.1 definitions

We denote a complete graph of 𝑛 vertices by 𝐾𝑛.

Theorem 4.4 (Ramsey Number). Given 𝑟,𝑏 1. A (edge-)coloring of 𝐾𝑛 by colors 1,2 is said to satisfy the Ramsey property of type (𝑟,𝑏) if there exists

Ramsey theorem states that there exists a smallest integer, 𝑛 = 𝑅(𝑟,𝑏), called the Ramsey number, such that, any coloring of 𝐾𝑛 satisfies the Ramsey property. This condition on 𝑛 is sometines denoted as 𝐾𝑛 𝐾𝑟,𝐾𝑏.

No proof.

_______________________________________________________________________________________

Theorem 4.5 (Ramsey Theorem, more colors). Given 𝑞1,,𝑞𝑘 1. A 𝑘-(edge-)coloring of 𝐾𝑛 by colors 𝑐1,,𝑐𝑘 is said to satisfy the Ramsey property of type (𝑞1,,𝑞𝑘) if there exists

Ramsey theorem states that there exists a smallest integer, 𝑛 = 𝑅(𝑞1,,𝑞𝑘), called the Ramsey number, such that, any 𝑘-coloring of 𝐾𝑛 satisfies the Ramsey property. This condition on 𝑛 is sometines denoted as 𝐾𝑛 𝐾𝑞1,𝐾𝑞𝑘.

No proof

_______________________________________________________________________________________

We consider 𝑑-uniform hypergraph, in which edges are 𝑑-element subsets of the vertex set (i.e., undirectional). A 𝑑-uniform complete graph with vertex set 𝑉 has the collection of all 𝑑-element subsets of 𝑉 as its edge set. It’s called a complete 𝑑-family of size |𝑉|, denoted by 𝑃𝑑(𝑉):= {𝑆 𝑉||𝑆|= 𝑑}. The graph itself is denoted by 𝐾|𝑉|𝑑.

Theorem 4.6 (Ramsey Theorem, hypergraph). Given 𝑞1,,𝑞𝑘 𝑑 1. A 𝑘-(edge-)coloring of 𝐾𝑛𝑑 by colors 𝑐1,,𝑐𝑘 is said to satisfy the Ramsey property of type (𝑞1,,𝑞𝑘) if there exists

Ramsey theorem states that there exists a smallest integer, 𝑛 = 𝑅𝑑(𝑞1,,𝑞𝑘), called the Ramsey number, such that, any 𝑘-coloring of 𝐾𝑛𝑑 satisfies the Ramsey property. This condition on 𝑛 is sometines denoted as 𝐾𝑛𝑑 𝐾𝑞1𝑑,𝐾𝑞𝑘𝑑.

Proof: See a partial proof below.

4.3.2 properties

Proposition 4.1. 𝑅(3,3)= 6.

Proof. Pigeon principle. □

Proposition 4.2.

Proof.

_______________________________________________________________________________________

Proposition 4.3. 𝑅(𝑝,𝑞)𝑅(𝑝 1,𝑞)+𝑅(𝑞,𝑝 1).

Proof. It can be easily proved using Lemma 4.3. Here, we give a proof purely based on the definition.

Let 𝑛 = 𝑅(𝑝 1,𝑞)+𝑅(𝑞,𝑝 1). We want to show that any coloring of 𝐾𝑛 satisfies the property. Fix a coloring. Consider choosing an arbitrary vertex 𝑣. Let 𝑉1 = {𝑤 𝑉|𝑢𝑤 is in color 1} and define 𝑉2 analogously. By Pigeonhole Principle, either |𝑉1|𝑅(𝑝 1,𝑞) or |𝑉2|𝑅(𝑝,𝑞 1). WLOG, we assume the first case. Therefore,

Thus, 𝑛 satisfies the property. □

Corollary 4.1. 𝑅(𝑝,𝑞)( 𝑝+𝑞2 𝑝1) .

Proof. Induction on the sum of 𝑝 and 𝑞. (Assume it is true for 𝑝 +𝑞 < 𝑛, …) □

Proposition 4.4. Generalizing the proposition above, we obtain:

𝑅(𝑞1,,𝑞𝑘)1 (∑︁ 𝑖=1𝑘𝑅(𝑞 1,,𝑞𝑖1,𝑞𝑖 1,𝑞𝑖+1,,𝑞𝑘))𝑘 +1.

E.g., 𝑅(3,3,3)17.

_______________________________________________________________________________________

Proposition 4.5. The 𝑡 = 1 case is just Thm 3.2 Pigeonhole principle, strong form:

𝑅1(𝑞1,,𝑞𝑘)= 𝑞1 + +𝑞𝑘 𝑘 +1.

Lemma 4.3.

𝑅𝑡(𝑝,𝑞)= 𝑅𝑡1 [𝑅𝑡(𝑝 1,𝑞),𝑅𝑡(𝑝,𝑞 1)]+1.

Proof: TODO

Proposition 4.6.

𝑅𝑡(𝑞1,,𝑞𝑘)𝑅𝑡 [𝑞1,,𝑞𝑘2,𝑅𝑡(𝑞𝑘1,𝑞𝑘)].

Proof. Consider 𝑛 = 𝑅𝑡 [𝑞1,,𝑞𝑘2,𝑅𝑡(𝑞𝑘1,𝑞𝑘)]. Any □

4.3.3 TODO proof of the theorem

5 Inclusion-exclusion principle

Theorem 5.1. Let 𝑆be a finite set. Let 𝑃1,,𝑃𝑛 be properties referring to the objects of 𝑆. Let 𝐴𝑖 denote the set of elements satisfying 𝑃𝑖 in 𝑆. The number of elements in 𝑆satisfying none of the properties 𝑃1,,𝑃𝑛 is given by

|𝐴1¯ 𝐴𝑛¯|= |𝑆|∑︁ 𝑖|𝐴𝑖|+∑︁ 𝑖<𝑗|𝐴𝑖 𝐴𝑗|+ +(1)𝑛|𝐴 1 𝐴𝑛|.

Proof.

5.1 weighted version

Let 𝑆 be a set (either finite or infinite). Let R𝑆 denote the set of functions from 𝑆 to R. R𝑆 is a vector space (with usual addition and scalar multiplication). Moreover, R𝑆 is an algebra with usual product and identity 1𝑆.

Let 𝑤 : 𝑆 R be a weight function which is nonzero at only finitely many points.

Theorem 5.2. Let 𝑆be a finite or infinite set. Let 𝑃1,,𝑃𝑛 be properties referring to the objects of 𝑆. Let 𝐴𝑖 denote the set of elements satisfying 𝑃𝑖 in 𝑆. Then,

1𝐴1¯𝐴𝑛¯ = 1𝑆 +∑︁ 𝑘=1𝑛(1)𝑘 ∑︁ 𝑖1<<𝑖𝑘1𝐴𝑖1𝐴𝑖𝑘.

Given a weight function 𝑤 on 𝑆, as a corollary, we have

𝑤(𝐴1¯ 𝐴𝑛¯)= 𝑤(𝑆)+∑︁ 𝑘=1𝑛(1)𝑘 ∑︁ 𝑖1<<𝑖𝑘𝑤(𝐴𝑖1 𝐴𝑖𝑘).

Proof. [TODO: ]

Let 𝑆 be a finite set and 𝐴1,,𝐴𝑛 be subsets of 𝑆. We define two functions 𝛼 and 𝛽 on 𝑃([𝑛]) as follows:

𝛼()= 0,𝛼(𝐼)= 𝑤(𝑖𝐼𝐴𝑖) if 𝐼 𝛽()= 0,𝛽(𝐼)= 𝑤(𝑖𝐼𝐴𝑖) if 𝐼 .

Derivation 5.1. Applying the weighted inclusion-exclusion principle ((5.2)) to 𝐴𝑗 (𝑗 𝐽), we have

𝑤(𝑗𝐽𝐴𝑖¯) = 𝑤(𝑆)+∑︁ 𝐼𝐽,𝐼≠∅(1)|𝐼|𝑤( 𝑖𝐼𝐴𝑖) 0 = 𝑤(𝑗𝐽𝐴𝑖)+∑︁ 𝐼𝐽,𝐼≠∅(1)|𝐼|𝑤( 𝑖𝐼𝐴𝑖)

Applying it to 𝐴𝑗¯ (𝑗 𝐽) (complement of the properties), we have

𝑤(𝑗𝐽𝐴𝑖) = 𝑤(𝑆)+∑︁ 𝐼𝐽,𝐼≠∅(1)|𝐼|𝑤( 𝑖𝐼𝐴𝑖¯) = 𝑤(𝑆)+∑︁ 𝐼𝐽,𝐼≠∅(1)|𝐼|𝑤(𝑆 𝑖𝐼𝐴𝑖) = ∑︁ 𝑘=0𝑛(𝑛 𝑘)(1)𝑘𝑤(𝑆) =0 +∑︁ 𝐼𝐽,𝐼≠∅(1)|𝐼|+1𝑤( 𝑖𝐼𝐴𝑖)

Adding 𝛼() and 𝛽() to the above results, we get:

Proposition 5.1. For any 𝐽 [𝑛],

𝛽(𝐽)= ∑︁ 𝐼𝐽(1)|𝐼|+1𝛼(𝐼),

and

𝛼(𝐽)= ∑︁ 𝐼𝐽(1)|𝐼|+1𝛽(𝐼),

5.2 derangements

Definition 5.1. A permutation of [𝑛] is called a derangement if no 𝑖 [𝑛] is placed at the 𝑖-th position. The number of derangements of [𝑛] is denoted by 𝐷𝑛.

Proposition 5.2. The derangement sequence 𝐷𝑛 satisfies the recurrence relation

𝐷𝑛 = (𝑛1)(𝐷𝑛1 +𝐷𝑛2),𝑛 3

with the initial conditions 𝐷1 = 0,𝐷2 = 1.

Proof. Consider how to satisfy the derangement requirement for the 𝑛-th position. We can put any one from [𝑛1] at the 𝑛-th position. Fix 𝑘 [𝑛1]. There are two cases:

Thus, 𝐷𝑛 = (𝑛1)(𝐷𝑛2 +𝐷𝑛1). □

Derivation 5.2. Let 𝑃𝑖 denote the property that a permutation of [𝑛] has the integer 𝑖 in the 𝑖-th position. Let 𝐴𝑖 denote the set of all permutations satisfying 𝑃𝑖. We want to count 𝐴1¯ 𝐴2¯ 𝐴𝑛¯.

For 1 𝑖1 < < 𝑖𝑘 𝑛, |𝐴𝑖1 𝐴𝑖𝑘|= (𝑛𝑘)!. By the Inclusion-Exclusion principle, we have

𝐷𝑛 = |𝐴1¯ 𝐴2¯ 𝐴𝑛¯| = |𝑆|+∑︁ 𝑘=1𝑛(1)𝑘 ∑︁ 𝑖1<<𝑖𝑘|𝐴𝑖1 𝐴𝑖𝑘| = 𝑛! +∑︁ 𝑘=1𝑛(1)𝑘(𝑛 𝑘)(𝑛𝑘)! = 𝑛! ∑︁ 𝑘=0𝑛(1)𝑘 𝑘! .

Theorem 5.3. For 𝑛 1, the number of derangements of [𝑛] is

𝐷𝑛 = 𝑛! ∑︁ 𝑘=0𝑛(1)𝑘 𝑘! 𝑛! 𝑒 .

5.2.1 [TODO: circular non-consecutive permutations]

5.3 permutation with forbidden positions

Definition 5.2. Let 𝑋1,,𝑋𝑛 be subsets of [𝑛]. The set of permutations 𝑎1𝑎2𝑎𝑛 of [𝑛] such that

𝑎1 𝑋1,,𝑎𝑛 𝑋𝑛

is denoted by 𝑃(𝑋1,,𝑋𝑛).

Definition 5.3. A placement of 𝑛 indistinguishable rooks on an 𝑛-by-𝑛 board is called non-attacking if no two rooks share the same column or row.

For convenience, let us denote by 𝑆 the set of placements of 𝑛 non-attacking indistinguishable rooks on an 𝑛×𝑛-board. It is clear that there is a one-to-one correspondence between permutations of [𝑛] and placements in 𝑆 by placing the 𝑖-th rook to (𝑖,𝑎𝑖). A rook placement in 𝑆 is said to satisfy the property 𝑃𝑖 if the rook in the 𝑖-th row has column index in 𝑋𝑖. Let 𝐴𝑖 denote the set of rook placements satisfying 𝑃𝑖. Then,

|𝑃(𝑋1,,𝑋𝑛)| = |𝐴1¯ 𝐴𝑛¯| = |𝑆|+∑︁ 𝑘=1𝑛(1)𝑘 ∑︁ 𝑖1<<𝑖𝑘|𝐴𝑖1 𝐴𝑖𝑘|.

5.4 [TODO: nonconsecutive permutations]

Definition 5.4. A permutation of [𝑛] is called nonconsecutive if the consecutive words of length two

12,23,,,(𝑛1)𝑛

do not occur. We denote by 𝑄𝑛 the number of nonconsecutive permutations of [𝑛].

Theorem 5.4. For 𝑛 1,

𝑄𝑛 = ∑︁ 𝑘=0𝑛1(1)𝑘(𝑛1 𝑘) (𝑛𝑘)!.

Proof. Let 𝑃𝑖 be the property that 𝑖(𝑖 +1) occurs in the permutation and 𝐴𝑖 be the set of permutations with this property.

We have to know |𝐴𝑖1 𝐴𝑖𝑘 for 1 𝑖1 < < 𝑖𝑘 𝑛1. Notice that, unlike previous situations, the property 𝑃𝑖 involves two positions. Assume 𝑖1𝑖2𝑖𝑘 consists of 𝑚 consecutive segments, with length 𝑛1,,𝑛𝑚, respectively. 𝑛1 + +𝑛𝑚 𝑚 = 𝑘. Then, there are in total 𝑛𝑘 remaining elements and 𝑚 groups free to arrange. The number of permutations in this case is (𝑛𝑘 +(𝑛1 + +𝑛𝑚 𝑚))! = (𝑛𝑘)!, which happens to depend only on 𝑘. [TODO: interpretation]

Then,

𝑄𝑛 = |𝑆|+∑︁ 𝑘=0𝑛1(1)𝑘 ∑︁ 𝑚=1 𝑘 ∑︁ 1𝑖1<<𝑖𝑘𝑛1:𝑚 concesutive segments (𝑛𝑘)! = 𝑛! +∑︁ 𝑘=0𝑛1(1)𝑘 ∑︁ 1𝑖1<<𝑖𝑘𝑛1(𝑛𝑘)! = 𝑛! +∑︁ 𝑘=0𝑛1(1)𝑘(𝑛1 𝑘) (𝑛𝑘)!.

Theorem 5.5. For 𝑛 2,

𝑄𝑛 = 𝐷𝑛 +𝐷𝑛1.

Proof. [TODO: ]

5.5 Euler totient function

Definition 5.5. The Euler totient function is defined by

𝜑(𝑛)= # integers from 1 to n coprime to n,𝑛 Z+.

Theorem 5.6. Let 𝑛 be a positive integer factorized into the form

𝑛 = 𝑝1𝑒1 𝑝𝑟𝑒𝑟 ,

where 𝑝1,,𝑝𝑟 are distinct primes with powers 1. Then,

𝜑(𝑛)= 𝑛 𝑖=1𝑟 (1 1 𝑝𝑖 ) .

Proof. Hint: The number of integers from 1 to 𝑛 divisible by distinct primes 𝑝𝑖1,,𝑝𝑖𝑘 is 𝑛 𝑝𝑖1𝑝𝑖𝑘. Use the Inclusion-Exclusion principle. □

5.6 rook polynomials

6 Möbius inversion

Definition 6.1 (locally finite poset). A poset (𝑋,) is called locally finite, if for any 𝑥 𝑦 𝑋 the closed interval [𝑥,𝑦]:= {𝑧 𝑋|𝑥 𝑧 𝑦} is a finite set.

Definition 6.2. Let (𝑋,) be a locally finite poset. The incidence algebra of 𝑋 is a tuple (I(𝑋),), where

P.S. Why the variable in the convolution is taken [𝑥,𝑦] instead of 𝑋 as usual? Actually, under the assumption that the incidence functions have a value 0 for empty intervals, the usual convolution

𝑥,𝑦 𝑋 : (𝛼 𝛽)(𝑥,𝑦)= ∑︁ 𝑧𝑋𝛼(𝑥,𝑧)𝛽(𝑧,𝑦)

is still well-defined: If 𝑥⋠𝑦, either 𝑥⋠𝑧 or 𝑧⋠𝑦. Thus, either 𝛼(𝑥,𝑧)= 0 or 𝛽(𝑧,𝑦)= 0. Therefore, (𝛼 𝛽)(𝑥,𝑦)= 0.

We can check that

Definition 6.3. An incidence function 𝛿 is called an identity of the algebra (I(𝑋),) if

𝛿 𝑓 = 𝑓 = 𝑓 𝛿.

There is a unique identity (using Method 6.1), which is the delta function defined by

𝛿(𝑥,𝑦)= {1,if 𝑥 = 𝑦, 0,if 𝑥 𝑦, .

Method 6.1. Given incidence functions 𝑓 and and an equation 𝑔 𝑓 = , we can solve 𝑔 inductively as follows:

𝑔(𝑥,𝑥) = (𝑥,𝑥) 𝑓(𝑥,𝑥),𝑥 𝑋, 𝑔(𝑥,𝑦) = 1 𝑓(𝑦,𝑦) [(𝑥,𝑦)∑︁ 𝑥𝑧𝑦𝑔(𝑥,𝑧)𝑓(𝑧,𝑦)],𝑥 𝑦
from “closer” pairs to “further” pairs, as long as 𝑓(𝑥,𝑥) 0𝑥 𝑋.

Definition 6.4. Given an incidence function 𝑓,

Proposition 6.1. An incidence function 𝑓 is invertible iff 𝑓(𝑥,𝑥) 0 for all 𝑥 𝑋 (using Method 6.1).

Definition 6.5. The zeta function 𝜁 of the poset (𝑋,) is the constant 1 function, i.e.,

𝜁(𝑥,𝑦)= 1,𝑥 𝑦.

Definition 6.6. The Möbius function 𝜇 of the poset (𝑋,) is the inverse of the zeta function 𝜁,

𝜇 = 𝜁1.

[TODO: locally finite?]

Clearly, 𝜁 is invertible. Using Method 6.1, we can solve 𝜇:

𝜇(𝑥,𝑥) = 1,𝑥 𝑋, 𝜇(𝑥,𝑦) = ∑︁ 𝑥𝑧𝑦𝜇(𝑥,𝑧)= ∑︁ 𝑥𝑧𝑦𝜇(𝑧,𝑦),𝑥 𝑦.

Lemma 6.1. Let 𝑃,𝑄 be isomorphic posets, that is, there exists a bijection 𝜙 : 𝑃 𝑄 preserving the partial order:

𝑥,𝑦 𝑃 : 𝑥 𝑃𝑦𝜙(𝑥)𝑄 𝜙(𝑦).

Then, the induced map 𝜙 : I(𝑃)I(𝑄) defined by

𝑓 I(𝑃): [𝜙(𝑓)](𝜙(𝑥),𝜙(𝑦))= 𝑓(𝑥,𝑦)

is an algebra isomorphism.

In particular,

𝛿𝑄 = 𝜙(𝛿𝑃),𝜁𝑄 = 𝜙(𝜁𝑃),𝜇𝑄 = 𝜙(𝜇𝑃).

Proof. [TODO: ]

6.1 Möbius inversion

Given a finite (𝑋,). Let 𝐹(𝑋) denote the set of functions : 𝑋 R. For every 𝑓 𝐹, we define the product as

(𝛼 𝑓)(𝑥)= ∑︁ 𝑥𝑦𝛼(𝑥,𝑦)𝑓(𝑦),𝑥 𝑋, (𝑓 𝛼)(𝑥)= ∑︁ 𝑥𝑦𝑓(𝑥)𝛼(𝑥,𝑦),𝑦 𝑋.

Proposition 6.2. Let (𝑋,) be a finite poset, 𝛼 I(𝑋) be invertible, and 𝑓,𝑔 𝐹(𝑋). Then

𝑔 = 𝛼 𝑓𝑓 = 𝛼1 𝑔, 𝑔 = 𝑓 𝛼𝑓 = 𝑔 𝛼1.

[TODO: associativity]

Theorem 6.1 (Möbius inversion formula). Let (𝑋,) be a finite poset and 𝑓,𝑔 𝐹(𝑋). Then,

𝑔(𝑥)= ∑︁ 𝑥𝑦𝑓(𝑦),𝑥 𝑋𝑓(𝑥)= ∑︁ 𝑥𝑦𝜇(𝑥,𝑦)𝑔(𝑦),𝑥 𝑋; 𝑔(𝑦)= ∑︁ 𝑥𝑦𝑓(𝑥),𝑦 𝑋𝑓(𝑦)= ∑︁ 𝑥𝑦𝑔(𝑥)𝜇(𝑥,𝑦),𝑦 𝑋.

Proof. The LHSs can be written as 𝑔 = 𝜁 𝑓 and 𝑔 = 𝑓 𝜁, respectively. Then, apply the previous proposition. □

6.2 product posets

Let (𝑋1,1) and (𝑋2,2) be two posets. Then the partial order of the product poset (𝑋1 ×𝑋2,) is defined by

(𝑥1,𝑥2)(𝑦1,𝑦2)𝑥1 1𝑥2,𝑦1 2𝑦2.
TEST

Theorem 6.2. Let 𝜇𝑖 be the Möbius function of the poset (𝑋𝑖,𝑖) for 𝑖 = 1,2. Then the Möbius function 𝜇 of (𝑋1 ×𝑋2,)is given by

𝜇((𝑥1,𝑥2),(𝑦1,𝑦2))= 𝜇1(𝑥1,𝑦1)𝜇2(𝑥2,𝑦2).

Proof. Following Equation (6),

𝜇((𝑥1,𝑥2),(𝑦1,𝑦2))= ∑︁ (𝑥1,𝑥2)(𝑧1,𝑧2)(𝑦1,𝑦2)𝜇((𝑥1,𝑥2),(𝑧1,𝑧2)).

We proceed by induction. Note that the length of the longest chain in (𝑥1,𝑥2),(𝑧1,𝑧2) is strictly smaller than the one of (𝑥1,𝑥2),(𝑦1,𝑦2), so we induct on the length.

= ∑︁ (𝑥1,𝑥2)(𝑧1,𝑧2)(𝑦1,𝑦2)𝜇1(𝑥1,𝑧1)𝜇2(𝑥2,𝑧2) = 𝜇1(𝑥1,𝑦1)𝜇2(𝑥2,𝑦2)∑︁ (𝑥1,𝑥2)(𝑧1,𝑧2)(𝑦1,𝑦2)𝜇1(𝑥1,𝑧1)𝜇2(𝑥2,𝑧2)

The sum can be decomposed into a product of two sums because comprises two “independent” conditions, 1 and 2.

= 𝜇1(𝑥1,𝑦1)𝜇2(𝑥2,𝑦2)∑︁ 𝑥11𝑧11𝑦1𝜇1(𝑥1,𝑧1)∑︁ 𝑥22𝑧2𝑦2𝜇2(𝑥2,𝑧2) = 𝜇1(𝑥1,𝑦1)𝜇2(𝑥2,𝑦2)(𝜇1 𝜁)(𝑥1,𝑦1)·(𝜇2 𝜁)(𝑥2,𝑦2)= 𝜇1(𝑥1,𝑦1)𝜇2(𝑥2,𝑦2).

6.3 [TODO: number theoretical]

Definition 6.7. The poset (Z+,|) with the partial order of divisibility is locally finite. Let 𝜇 be the Möbius function of (Z+,|). Then, the Möbius function of number theory is defined by

𝜇(𝑛):= 𝜇(1,𝑛).

Proposition 6.3. If 𝑎|𝑏, then 𝜇(𝑎,𝑏)= 𝜇(1,𝑏/𝑎).

Proof. Using the poset isomorphism 𝑥𝑥/𝑎 and Lemma 6.1, we have [TODO: ]. □

By Equation 6, we have

Proposition 6.4.

𝜇(1) = 1, 𝜇(𝑛) = ∑︁ 𝑑|𝑛,𝑑≠𝑛𝜇(𝑑),𝑛 2.

Derivation 6.1. Lemma 6.1

Its Möbius function is partially given by

𝜇(𝑚,𝑚) = 1,𝑚 Z+, 𝜇(1,𝑛) = ∑︁ 𝑚Z+,𝑚|𝑛,𝑚≠𝑛 𝜇(1,𝑚)

Theorem 6.3. The Möbius function of number theory is explicitly given by

𝜇(𝑛)= {1, if 𝑛 = 1 (1)𝑗,if 𝑛 is a product of 𝑗 distinct, non-repeated primes 0, if 𝑛 has repeated prime factors.

Proposition 6.5 (Möbius inversion, number theory). Given functions 𝑓,𝑔 : Z±C,

𝑔(𝑛)= ∑︁ 𝑑|𝑛𝑓(𝑑),𝑛 Z+𝑓(𝑛)= ∑︁ 𝑑|𝑛𝑔(𝑑)𝜇 (𝑛 𝑑),𝑛 Z+.

Proof. 𝑔 = 𝑓 𝜁𝑓 = 𝑔 𝜇 by Möbius inversion 6.1. 𝜇(𝑑,𝑛)= 𝜇(𝑛/𝑑). □

Theorem 6.4. The Euler totient function 𝜙 (Defn. 5.5) and Möbius function 𝜇 are related by

𝜑(𝑛) 𝑛 = ∑︁ 𝑑|𝑛𝜇(𝑑) 𝑑 ,𝑛 Z+.

Proof. Let Φ(𝑑,𝑛):= {𝑎 [𝑛]: gcd (𝑎,𝑛)= 𝑑}. [TODO: Φ(𝑑,𝑛)Φ(1,𝑛/𝑑) is a bijection] [𝑛]= 𝑑|𝑛Φ(𝑑,𝑛). Thus, 𝑛 = 𝑑|𝑛|Φ(1,𝑛/𝑑)|= 𝜑 𝜁𝜑 = 𝑛𝜇. □

6.4 circular permutations with repetition

Definition 6.8. Let 𝑀 = {𝑎1,,𝑎𝑘}, where 𝑘 2, be a multiset of 𝑘 distinct elements. A circular 𝑛-permutation of 𝑀 is an arrangement of 𝑛 elements of 𝑀 on a circle.

Theorem 6.5. The number of circular 𝑛-permutations of a set of 𝑘 distinct objects with repetition allowed is

1 𝑛∑︁ 𝑑|𝑛𝑘𝑑𝜑 (𝑛 𝑑) = 1 𝑛∑︁ 𝑎|𝑛𝑘𝑛/𝑎𝜑 (𝑎).

Proof. [TODO: ]

Theorem 6.6. Let 𝑀 be a multiset of type (𝑛1,,𝑛𝑘) with 𝑚 = gcd (𝑛1, ,𝑛𝑘)and 𝑛 = 𝑛1 + +𝑛𝑘. Then the number of circular permutations of 𝑀 is

1 𝑛∑︁ 𝑑|𝑚𝜑(𝑑)( 𝑛/𝑑 𝑛1/𝑑,,𝑛𝑘/𝑑).

Proof. [TODO: ]

7 Linear Recurrence

Definition 7.1. A sequence (𝑥𝑛; 𝑛 0) of numbers is said to satisfy a linear recurrence relation of order 𝑘 if there exist functions 𝛼’s, with 𝛼𝑘(𝑛) 0, and a function 𝛽 such that

𝑛 𝑘 : 𝑥𝑛 = 𝛼1(𝑛)𝑥𝑛1 +𝛼2(𝑛)𝑥𝑛2 + +𝛼𝑘(𝑛)𝑥𝑛𝑘 +𝛽(𝑛).

homogeneity

The linear recurrence relation in (7.1) is said to be homogeneous if 𝛽(𝑛)= 0 for all 𝑛 𝑘 and non-homogeneous otherwise.

constant

A linear recurrence relation with constant coefficients has constant 𝛼 functions (excluding 𝛽).

solution

A solution of the relation is a sequence satisfying the relation. A general solution is a solution with parameters 𝑐1,,𝑐𝑘 and functions 𝑢𝑛:

𝑥𝑛 = 𝑢𝑛(𝑐1,,𝑐𝑘),𝑛 0,

provided that for arbitrary initial values 𝑢0,,𝑢𝑘1 there exist constants 𝑐1,,𝑐𝑘 such that (7.1) is the (unique) solution satisfying the initial conditions.

P.S. There can be many sequences satisfying the same linear recurrence relation. When we talk about the set of solutions / general solution, we refer to the case where the initial values are unspecified.

7.1 structural theorems

Notation 1. Throughout this section, the vector notation (𝑎) will be used to write sequences ((𝑎𝑛; 𝑛 0)).

Let S denote the infinite-dimensional vector space of all sequences under the ordinary addition and scalar multiplication.

Theorem 7.1. Let H𝑘 denote the set of solutions of a homogeneous linear recurrence relation. Then, H𝑘 is 𝑘-dimensional subspace of S.

Consequently, if 𝑢(1),,𝑢(𝑘)are linearly independent solutions in H𝑘, then all solutions in H𝑘 can be written as

𝑥 = 𝑐1𝑢(1)+ +𝑐 𝑘𝑢(𝑘).

Proof. It is easy to check that H𝑘 is a vector subspace.

To show that H𝑘 is 𝑘-dimensional, we just need to find an isomorphism between H𝑘 and R𝑘. Consider the projection 𝜋 : SR𝑘 defined by 𝑥(𝑥0,𝑥1,,𝑥𝑘1). Then, 𝜋|H𝑘 is an isomorphism. □

Corollary 7.1. Given a set of specific solutions of a homogeneous linear recurrence relation, any linear combination of them is also a solution.

Theorem 7.2. Let N𝑘 denote the set of solutions of a non-homogeneous linear recurrence relation. Let H𝑘 be the set of solutions of the corresponding homogeneous linear recurrence. Then, N𝑘 is a translate of H𝑘 in S, i.e.,

N𝑘 = 𝑛+H𝑘 = {𝑛+ : H𝑘}.

(a.k.a. a 𝑘-dimensional affine subspace of S)

Geometrical intepretation: not through the origin

Proof. Let 𝑣 N𝑘. Show that 𝑛𝑣 H𝑘 (𝛽(𝑛) is cancelled). This proves that N𝑘 𝑛+H𝑘.

Conversely, show that, for any H𝑘, 𝑛 + is a solution. □

Method 7.1. By Theorem 7.2, we can summarize a general framework for finding the solutions to a nonhomogeneous linear recurrence relation:

1.
Find the general solution of the corresponding homogeneous relation (special cases in Sec. 7.2.1).
2.
Find a particular solution of the nonhomogeneous relation (special cases in Sec. 7.2.1)
3.
Sum the two to derive the general solution of the nonhomogeneous relation. Determine values of the constants (parameters) that satisfy the initial conditions.

Definition 7.2. Let (𝑢𝑛(1)),,(𝑢𝑛(𝑘)) be 𝑘 solutions in H𝑘. Then the Wronskian 𝑊𝑘(𝑛) of the solutions is defined as

𝑊𝑘(𝑛):= det [ 𝑢𝑛(1) 𝑢𝑛(𝑘) 𝑢𝑛+𝑘1(1) 𝑢𝑛+𝑘1(𝑘) ],

i.e., the window of 𝑘 items of the solutions starting from position 𝑛.

7.2 constant coefficients

Derivation 7.1. Consider linear recurrence relation of order 1: 𝑥𝑛 = 𝑞𝑥𝑛1,𝑛 1. It is heuristic that the solution of it is a geometric sequence, 𝑥𝑛 = 𝑞𝑛. This hints that higher order relation with constant coefficients may have some solutions of the same form (not unique).

7.2.1 homogeneous

In this section, we consider homogeneous linear recurrence relations with constant coefficients.

Definition 7.3. Let (𝑥𝑛; 𝑛 0) be a homogeneous linear recurrence relation of order 𝑘 with constant coefficients:

𝑥𝑛 = 𝛼1𝑥𝑛1 + +𝛼𝑘𝑥𝑛𝑘,𝛼𝑘 0.

Its characteristic equation is formed by substituting 𝑥𝑛𝑖 = 𝑟𝑘𝑖, defined as

𝑟𝑘 𝛼 1𝑟𝑘1 𝛼 𝑘 = 0.

Its characteristic polynomial is defined as

𝑃(𝑟)= 𝑟𝑘 𝛼 1𝑟𝑘1 𝛼 𝑘.

Theorem 7.3.

(a)
Let 𝑞 0. The sequence 𝑥𝑛 = 𝑞𝑛,𝑛 0 is a solution of (7.3) iff 𝑞 is a root of the characteristic equation (7.3).
(b)
Suppose the nonzero roots of characteristic equation (7.3) are 𝑘 distinct values 𝑞1,,𝑞𝑘, then the general solution of (7.3) is given by

𝑥𝑛 = 𝑐1𝑞1𝑛 + +𝑐 𝑘𝑞𝑘𝑛,𝑛 0.

Proof.

(a)
Easy to check (Note that 𝑞 0 is important).
(b)
Check the condition in the definition: [TODO: ]

Example 7.1. Fibonacci sequence: 𝑓0 = 0,𝑓1 = 1, 𝑓𝑛 = 𝑓𝑛1 +𝑓𝑛2 for 𝑛 2.

The characteristic equation is 𝑟2 𝑟 1 = 0, with two roots 1±5 2 . By Theorem 7.3 (b), the general solution is

𝑓𝑛 = 𝑐1 ( 1 +5 2 ) 𝑛 +𝑐 2 ( 1 5 2 ) 𝑛.

Plugging this into 𝑓0 and 𝑓1, we can obtain 𝑐1 = 1/5,𝑐2 = 1/5. Thus, the general formula for the Fibonacci sequence is

𝑓𝑛 = 1 5 (1 +5 2 ) 𝑛 1 5 (1 5 2 ) 𝑛.

Definition 7.4. A number 𝑞 is a root of multiplicity 𝑚 of a polynomial if it occurs 𝑚 times in the multiset of roots. In other words, the polynomial can be decomposed as (𝑟 𝑞)𝑚𝑠(𝑟) where 𝑠(𝑟) is a polynomial with 𝑠(𝑞) 0.

Theorem 7.4.

Proof.

(a)
Since 𝑛𝑗 can be represented as a linear combination of falling factorial polynomial (𝑛)𝑗’s (Definition 9.8, Theorem 9.5), and that linear combinations of solutions are also solutions (the solution space is a vector space), it suffices to prove the following claim:

For every 0 𝑗 𝑚1, 𝑥𝑛 = (𝑛)𝑗𝑞𝑛 is a solution.

Plugging 𝑥𝑛 = (𝑛)𝑗𝑞𝑛 into

𝑥𝑛 𝛼1𝑥𝑛1 𝛼𝑘𝑥𝑛𝑘 := 𝑅(𝑞),

we have (Hint, the coefficients here do not matter much)

𝑅(𝑞)= ∑︁ 𝑖=0𝑘𝛼 𝑖(𝑛𝑖)𝑗𝑞𝑛.

Note that (𝑞𝑛)(𝑗)= (𝑛)𝑗𝑞𝑛𝑗 for 𝑛 𝑗. If we lift the degree of 𝑃(𝑟) by multiplying it with 𝑟𝑛𝑘, we have

(𝑟𝑛𝑘𝑃(𝑟))(𝑗)= ∑︁ 𝑖=0𝑘𝛼 𝑖(𝑛𝑖)𝑗𝑞𝑛𝑗.

Thus, 𝑅(𝑞)= 𝑞𝑗(𝑟𝑛𝑘𝑃(𝑟))(𝑗)|𝑟=𝑞.

Since the root 𝑞 has multiplicity 𝑚, we can write the characteristic polynomial (7.3) as 𝑃(𝑟)= (𝑟 𝑞)𝑚𝑄(𝑟) for some polynomial 𝑄(𝑟) of degree 𝑘 𝑚. Then,

𝑅(𝑞) = 𝑞𝑗[𝑟𝑛𝑘(𝑟 𝑞)𝑚𝑄(𝑟)](𝑗)| 𝑟=𝑞 = 𝑞𝑗 ∑︁ 𝑖=0𝑗(𝑗 𝑖)[(𝑟 𝑞)𝑚](𝑖)| 𝑟=𝑞[𝑟𝑛𝑘𝑄(𝑟)](𝑗𝑖)| 𝑟=𝑞 = 𝑞𝑗 ∑︁ 𝑖=0𝑗(𝑗 𝑖)(𝑚)𝑖(𝑟 𝑞)𝑚𝑖 | 𝑟=𝑞[𝑟𝑛𝑘𝑄(𝑟)](𝑗𝑖)| 𝑟=𝑞 = 0.
(b)
[TODO: ]

7.2.2 nonhomogeneous

In this section, we consider nonhomogeneous linear recurrence relations with constant coefficients.

Theorem 7.5 (coefficient variation methods). Given a recurrence relation of the second order of the form

𝑥𝑛 = 𝛼1𝑥𝑛1 +𝛼2𝑥𝑛2 +𝑐𝑞𝑛,

where 𝛼1,𝛼2,𝑐,𝑞are constants. Let 𝑞1,𝑞2 be roots of its characteristic euqtion 𝑥2 𝛼1𝑥 𝛼2 = 0. Then (7.5) has a particular solution of the following forms, where 𝐴 is a constant to be determined:

(a)
If 𝑞 𝑞1,𝑞 𝑞2, then 𝑥𝑛 = 𝐴𝑞𝑛.
(b)
If 𝑞 = 𝑞1,𝑞 𝑞2, then 𝑥𝑛 = 𝐴𝑛𝑞𝑛.
(c)
If 𝑞 = 𝑞1 = 𝑞2, then 𝑥𝑛 = 𝐴𝑛2𝑞𝑛.

Theorem 7.6 (not required). Given a recurrence relation of the second order

𝑥𝑛 = 𝛼1𝑥𝑛1 +𝛼2𝑥𝑛2 +𝛽(𝑛),𝑛 2,

where 𝛽(𝑛)is a polynomial with degree 𝑘.

Example 7.2. Solve 𝑎𝑛 = 4𝑎𝑛1 4𝑎𝑛2 +3𝑛+1 with 𝑎0 = 1,𝑎1 = 2.

Since 4 +(4)= 0 1, we can apply case (a). 3𝑛+1 has degree 1, so a particular solution is 𝑎𝑛 = 𝐴0 +𝐴1𝑛. We can solve that 𝐴1 = 3,𝐴0 = 13.

The corresponding homogeneous relation has the general solution 𝑐02𝑛 +𝑐1𝑛2𝑛.

The general solution of the nonhomogenous relation is then 𝑎𝑛 = 13 +3𝑛+𝑐02𝑛 +𝑐1𝑛2𝑛. Plugging this into the intial conditions, we can obtain 𝑐0 = 12,𝑐1 = 5.

8 Generating Functions

8.1 ordinary generating function

Definition 8.1. The ordinary generating function of an infinite sequence 𝑎0,𝑎1, is the power series

𝐴(𝑥; 𝑎𝑛)= ∑︁ 𝑖=0𝑎 𝑖𝑥𝑖.

The power series is said to have a rational form if, within its domain of convergence, it can be written as a quotient of two polynomials.

Remark 2 (Useful series).

Theorem 8.1. Let 𝑚be a positive integer. Let 𝑎𝑛 be the number of ways to distribute 𝑛 indistinguishable balls into 𝑚 distinct boxes. Then,

∑︁ 𝑛=0𝑎 𝑛𝑥𝑛 = (∑︁ 𝑛=0𝑥𝑛) 𝑚 = 1 (1 𝑥)𝑚

An algebraic proof is already provided in Remark 2.

combinatorial proof. The meaning of the LHS is clear.

This precisely describe the process of enumerating all the ways to distribute the balls. □

Theorem 8.2 (generalized). Let 𝑎𝑛 be the number of ways to distribute 𝑛 indistinguishable balls into 𝑚 distinguishable boxes, such that the number of balls held in the 𝑘-th box is allowed in a subset 𝐶𝑘 Z0. Then,

∑︁ 𝑛=0𝑎 𝑛𝑥𝑛 = (∑︁ 𝑛𝐶1𝑥𝑛) (∑︁ 𝑛𝐶𝑚 𝑥𝑛) .

Example 8.1. Let multiset 𝑀 = {·𝑥1,·𝑥2,·𝑥3,·𝑥4}. What is the generating function for (𝑎𝑛; 𝑛 0), where 𝑎𝑛 is the number of 𝑛-combinations of 𝑀 satisfying the following conditions? (1) 𝑥1 occurs 1, 3, or 11 times, (2) 𝑥2 occurs at least 10 times, (3) 𝑥3 occurs a multiple of 3 number of times.

(𝑥1 +𝑥3 +𝑥11) ( 𝑖=10𝑥𝑖) (𝑖=0𝑥3𝑖) (𝑖=0𝑥𝑖) = (𝑥1+𝑥3+𝑥11)𝑥10 (1𝑥)2(1𝑥3) .

8.2 exponential generating function

Definition 8.2. The exponential generating function of an infinite sequence 𝑎0,𝑎1, is the power series

𝐸(𝑥; 𝑎𝑛)= ∑︁ 𝑛=0𝑎𝑛 𝑛! 𝑥𝑛.

The power series is said to have a rational form if, within its domain of convergence, it can be written as a quotient of two polynomials.

Remark 3 (Useful series).

Theorem 8.3. Let 𝑚be a positive integer. Let 𝑎𝑛 be the number of ways to distribute 𝑛 distinct balls into 𝑚 distinct boxes. Then,

∑︁ 𝑛=0𝑎 𝑛𝑦𝑛 𝑛! = (∑︁ 𝑛=0𝑦𝑛 𝑛! ) 𝑘 = 𝑒𝑘𝑥

Proof. This is related to Combination with distinguishable boxes 3.7. Easy to prove algebraically. □

Theorem 8.4 (generalized). Let 𝑎𝑛 be the number of ways to distribute 𝑛 distinct balls into 𝑚 distinct boxes, such that the number of balls held in the 𝑘-th box is allowed in a subset 𝐶𝑘 Z0. Then,

∑︁ 𝑛=0𝑎 𝑛𝑥𝑛 𝑛! = (∑︁ 𝑛𝐶1𝑥𝑛 𝑛! ) (∑︁ 𝑛𝐶𝑚 𝑥𝑛 𝑛! ) .

Example 8.2. Determine the number of ways of 𝑛-permutations of a multiset 𝑀 = {∞𝑎,∞𝑏,∞𝑐} such that the number of 𝑎 is even and the number of 𝑏 is at least one.

𝐸(𝑥) = (∑︁ 𝑖=0 𝑥2𝑖 (2𝑖)! ) (∑︁ 𝑖=1𝑥𝑖 𝑖! ) (∑︁ 𝑖=0𝑥𝑖 𝑖! ) = 𝑒𝑥 +𝑒𝑥 2 (𝑒𝑥 1)𝑒𝑥 = 1 2(𝑒3𝑥 𝑒2𝑥 +𝑒𝑥 1) = 1 2 +∑︁ 𝑖=03𝑛 2𝑛 +1 2 ·𝑥𝑛 𝑛! . Thus, 𝑎𝑛 = 3𝑛2𝑛+1 2 for 𝑛 1 and 𝑎0 = 1 2 +3020+1 2 = 0.

8.3 connection with linear recurrence

Method 8.1. Suppose we have a recurrence relation 𝑎𝑛 of degree 𝑘. To solve it,

1.
We can write the generating function of 𝑎𝑛 as
𝐴(𝑥)= first 𝑘 terms +∑︁ 𝑛=𝑘(recurrence relation for 𝑎 𝑛)𝑥𝑛.
2.
Since the relation is linear, the series can be further written in terms of 𝐴(𝑥) (and minor terms).
𝐴(𝑥)= first 𝑘 terms +some polynomial ·𝐴(𝑥)+some polynomial.

This gives a formula of 𝐴(𝑥) as a quotient of two polynomials. (E.g., Theorem 8.5)

3.
Try to represent 𝐴(𝑥) as a linear combination of power series (E.g., Remark 2 and Proposition 8.1) so that we can have the explicit formula of 𝑎𝑛.

Theorem 8.5.

Proof.

Proposition 8.1 (partial fractions).

(a)
If 𝑃(𝑥) is a polynomial of degree less than 𝑘, then
𝑃(𝑥) (1 𝑎𝑥)𝑘 = ∑︁ 𝑖=1𝑘 𝐴𝑖 (1 𝑎𝑥)𝑖,

where 𝐴1,,𝐴𝑘 are constants to be determined.

(b)
If 𝑃(𝑥) is a polynomial of degree less than 𝑝 +𝑞 +𝑟, then
𝑃(𝑥) (1 𝑎𝑥)𝑝(1 𝑏𝑥)𝑞(1 𝑐𝑥)𝑟 = 𝐴1(𝑥) (1 𝑎𝑥)𝑝 + 𝐴2(𝑥) (1 𝑏𝑥)𝑞 + 𝐴3(𝑥) (1 𝑐𝑥)𝑟,

where 𝐴1(𝑥),𝐴2(𝑥),𝐴3(𝑥) are polynomials of degree 𝑞 +𝑟, 𝑝 +𝑟, 𝑝 +𝑞, respectively.

No proof.

8.4 Catalan Sequence

Let 𝐶𝑛 denote the number of ways to divide a labeled convex (𝑛+2)-polygon into triangles. We consider a partition (i.e., into distinct sets) of all divisions according to the first node connected with node 1 except node 2.

Suppose 𝑘 > 2 is the first node connected to node 1, then there must be a segment between node 2 and node 𝑘. This leaves a (𝑘 1) polygon and an (𝑛+4 𝑘) polygon free to divide.

PIC

Figure 6: (n+2)-polygon

The number of divisions under this category is 𝐶𝑘3𝐶𝑛+2𝑘.

We thus obtain the recurrence relation

𝐶𝑛 = ∑︁ 𝑘=3𝑛+2𝐶 𝑘3𝐶𝑛+2𝑘 = ∑︁ 𝑘=0𝑛1𝐶 𝑘𝐶𝑛1𝑘.

Here, we define 𝐶0 = 1 (the triangle collapse to a segment for a 2-polygon).

Consider the generating function 𝐹(𝑥)= 𝑛=0𝐶𝑛𝑥𝑛.

9 Special Counting Sequences

9.1 Catalan Sequence

Definition 9.1. The Catalan1 sequence is

𝐶𝑛 = 1 𝑛+1( 2𝑛 𝑛) ,𝑛 0.

The 𝑛-th term 𝐶𝑛 is called the 𝑛-th Catalan number.

Theorem 9.1. 𝐶𝑛 is the number of lattice paths along the edges of an 𝑛×𝑛 grid (𝑛×𝑛 square cells and (𝑛+1)2 vertices) from (0,0) to (𝑛,𝑛), which do not pass above the diagonal.

Proof. Any lattice path from (0,0) to (𝑛,𝑛) have exactly 𝑛 right moves and 𝑛 up moves.

Let call a path “bad” if it passes above the diagonal. This happens iff it touches the next higher diagonal (red). Consider the first contact point of a bad path and the next higher diagonal. The portion of the path before (including) this point (black solid) has one more up move than right moves, and the portion after (including) it (red dotted) has one more right move than up moves. If we flip the latter portion about the next higher diagonal (red solid), the entire path has the same number of total moves, but with two more up moves than right moves. In conclusion, if a path is bad then it has (𝑛1) right moves and (𝑛+1) up moves.

Conversely, if a path has (𝑛1) right moves and (𝑛+1) up moves then it ends up at (𝑛1,𝑛+1). To reach this ending point, the path must cross the main diagonal. Therefore, having (𝑛1) right moves and (𝑛+1) up moves is the exact condition for the path to be bad.

Thus, 𝐶𝑛 =( 2𝑛 𝑛) ( 2𝑛 𝑛1) = 1 𝑛+1( 2𝑛 𝑛) .

PIC
Figure 7: Extracted from Wikipedia
[TODO: How to think of this.]
[TODO: Interpreation of the factor 1 𝑛+1 ]

Theorem 9.2. 𝐶𝑛 is the number of words 𝑎1𝑎2𝑎2𝑛 of length 2𝑛consisting of exactly 𝑛positive ones and exactly 𝑛 negative ones and satisfying

𝑎1 + +𝑎𝑖 0,𝑖 2𝑛.

Proof. Similar. □

9.1.1 [TODO: A geometric example]

9.2 Difference sequences

Definition 9.2. The first order difference sequence of a sequence (𝑎𝑛; 𝑛 0) defined as

Δ𝑎𝑛 := (Δ𝑎)𝑛 = 𝑎𝑛+1 𝑎𝑛,𝑛 0.

The 𝑝-th order difference sequence Δ𝑝𝑎𝑛 is the difference sequence of the sequence Δ𝑝1𝑎𝑛.

Viewing sequences as functions : Z0 C, we can represent the difference sequence in the following way.

Definition 9.3. Let S denote the vector space of functions 𝑓 : Z0 C with the ordinary addition and scalar multiplication. The difference operator Δ : SS is defined by

(Δ𝑓)(𝑛)= 𝑓(𝑛+1)𝑓(𝑛),𝑛 0.

Proposition 9.1. The difference operator Δ : SS is a linear map.

Definition 9.4. The difference table of a sequence 𝑓(𝑛) is the array

0 : 𝑓(0) 𝑓(1) 𝑓(2) 𝑓(3) 1 : (Δ𝑓)(0) (Δ𝑓)(1) (Δ𝑓)(2) 2 : (Δ2𝑓)(0) (Δ2𝑓)(1) 3 : (Δ3𝑓)(0)

Proposition 9.2. Let 𝑓(𝑛),𝑛 0 and 𝑔(𝑛),𝑛 0 be two sequences. If (Δ𝑝𝑓)(0)= (Δ𝑝𝑔)(0) for all 𝑝 0, then 𝑓(𝑛)𝑔(𝑛).

Proof: trivial.

Theorem 9.3. For every 𝑓 S, the 𝑝-th order difference sequence Δ𝑝𝑓 has the form

(Δ𝑝𝑓)(𝑛)= ∑︁ 𝑘=0𝑝(1)𝑝𝑘(𝑝 𝑘)𝑓(𝑛+𝑘),𝑛 0.

Proof. Observe that (Δ𝑝𝑓)(𝑛) can be represented a linear combination 0-th order terms 𝑓(𝑛),𝑓(𝑛+1),,𝑓(𝑛+𝑝). Specifically, every “computation path” from 𝑓(𝑛+𝑘) to (Δ𝑝𝑓)(𝑛) contribute ±𝑓(𝑛+𝑘) to the sum. E.g.,

PIC
Figure 8: Two paths from 𝑓(2) to (Δ3𝑓)(0)

The coefficient is determined by the number of “right moves”, which flip the sign. For 𝑓(𝑛+𝑘), it takes exactly 𝑝 moves, of which 𝑖 moves are to the left and (𝑝 𝑘) moves are to the right. Thus, the final coefficient for 𝑓(𝑛+𝑘) is (𝑝 𝑘)(1)𝑝𝑘. □

Theorem 9.4. The difference table of a sequence 𝑓(𝑛)can be determined by its 0-th diagonal sequence (Δ0𝑓)(0),(Δ1𝑓)(0),(Δ2𝑓)(0),. Specially, the sequence itself can be determined by

𝑓(𝑛)= ∑︁ 𝑘=0𝑛(𝑛 𝑘)(Δ𝑘𝑓)(0)= ∑︁ 𝑘=0(𝑛 𝑘)(Δ𝑘𝑓)(0),𝑛 0.

This is an analogy with Taylor expansion.

Proof. Similar combinatorial explanation:

PIC
Figure 9: Proof for Thm 9.4

Corollary 9.1. Let 𝑓(𝑛),𝑛 0 be a sequence. Its partial sum can be written as

∑︁ 𝑘=0𝑛𝑓(𝑘)= ∑︁ 𝑘=0𝑛(𝑛+1 𝑘 +1)(Δ𝑘𝑓)(0),𝑛 0.

Proof. Use 𝑘=𝑖𝑛(𝑘 𝑖) =( 𝑛+1 𝑘+1) . □

Definition 9.5. A sequence 𝑓(𝑛),𝑛 0 is called a polynomial sequence of degree 𝑝 if it has the form

𝑓(𝑛)= 𝛼𝑝𝑛𝑝 +𝛼 𝑝1𝑛𝑝1 + +𝛼 0,𝑛 0,

where 𝛼’s are constants and 𝛼𝑝 0.

Lemma 9.1. For every polynomial sequence 𝑓(𝑛),𝑛 0 of degree 𝑝, the (𝑝 +1)-th order difference sequence Δ𝑝+1𝑓 0.

Proof. The 1st order difference sequence of a polynomial sequence of degree 𝑝 is a polynomial sequence of degree at most 𝑝 1: This is because (𝑛+1)𝑘 𝑛𝑘 is a polynomial of degree 𝑘 1.

Specially, the 1st order difference sequence of a polynomial sequence of degree 0 is a zero sequence: 𝛼0[(𝑛+1)0 𝑛0]= 0. □

Corollary 9.2. Let 𝑓(𝑛)= 𝑛𝑝,𝑛 0, where 𝑝 is a constant. Then,

∑︁ 𝑘=0𝑛𝑘𝑝 = ∑︁ 𝑘=0𝑝(𝑛+1 𝑘 +1)(Δ𝑘𝑓)(0),

which is a polynomial of degree 𝑝 +1.

Proof. Discuss 𝑛 𝑝 and 𝑛 < 𝑝. □

Example 9.1. For 𝑓(𝑛)= 𝑛2,𝑛 0 and 𝑓(𝑛)= 𝑛3,𝑛 0.

0 1 4 13 2 , 0 1 8 27 1 7 19 6 12 6 .

Thus, 𝑘=0𝑛𝑘2 =( 𝑛+1 2) +2(𝑛+1 3) = (𝑛+1)𝑛(2𝑛+1) 6 , 𝑘=0𝑛𝑘3 =( 𝑛+1 2) +6(𝑛+1 3) +6(𝑛+1 4) = ((𝑛+1)𝑛 2 ) 2.

9.3 Stirling numbers

They are introduced by James Stirling in a purely algebraic setting (as connnection coefficients between monomials and falling polynomials) in 1730 and were rediscovered and given a combinatorial meaning by Masanobu Saka in 1782 [Mansour and Schork2015]. In the following, the definitions will be given by combinatorics and their algebraic meaning will be stated as theorems.

9.3.1 of the second kind

Definition 9.6. The Stirling numbers of the second kind, written 𝑆(𝑛,𝑘), are defined as the number of ways to partition a set of 𝑛 labeled objects into 𝑘 nonempty unlabeled subsets.

We have 𝑆(𝑝,𝑘)= 0 for 𝑘 > 𝑝 and 𝑆(𝑛,0)= 0 for 𝑛 1 and assume 𝑆(0,0)= 1.

Definition 9.7. The 𝑛-th Bell number 𝐵𝑛 is the number of partitions of an 𝑛-set into nonempty unlabeled subsets, that is,

𝐵𝑛 = ∑︁ 𝑘=0𝑛𝑆 𝑛,𝑘.

Proposition 9.3. The Stirling numbers of the second kind satisfy the recurrence relation

𝑆(𝑛,𝑘)= 𝑆(𝑛1,𝑘 1)+𝑘𝑆(𝑛1,𝑘), 𝑛 > 𝑘 1

with initial conditions

𝑆(0,0)= 𝑆(𝑛,𝑛)= 1 𝑛 0,𝑆(𝑛,0)= 0 𝑛 1,𝑆(𝑛,1)= 1 𝑛 1.

Proof. The initial conditions are easy to show.

To prove the recurrence, consider how to put the 𝑛-th object based on partition of the first (𝑛1) objects:

Definition 9.8. Let 𝑝 be a non-negative integer. The falling factorial polynomials in variable 𝑥 R are defined by (refers to Defn 3.12)

(𝑥)𝑝 := 𝑥𝑝̲ := 𝑝!(𝑥 𝑝) = {1, if 𝑝 = 0 𝑥(𝑥 1)(𝑥 𝑝 +1),if 𝑝 1.

The rising factorial polynomials are defined by

(𝑥)𝑝 := 𝑥𝑝¯ := 𝑥(𝑥 +1)(𝑥 +𝑝 1).

Theorem 9.5 (algebraic meaning). Let 𝑥 R and 𝑝 be a non-negative integer.

𝑥𝑝 = ∑︁ 𝑘=0𝑝(𝑥 𝑘)𝑘!𝑆(𝑝,𝑘)= ∑︁ 𝑘=0𝑝𝑆(𝑝,𝑘)(𝑥) 𝑘.

combinatorial, when 𝑥 =: 𝑛 N. 𝑘!𝑆(𝑝,𝑘) is the number of ways to partition a set of 𝑝 labeled objects into 𝑘 nonempty labeled subsets.

Consider partitioning a set of 𝑛 labeled objects into 𝑝 labeled subsets of which 𝑘 are nonempty. Then the number of ways of such partitions is (𝑛 𝑘)𝑘!𝑆(𝑝,𝑘).

Then, the number of ways to deliver 𝑝 labeled objects into 𝑛 labeled boxes (with empty boxes allowed) can be written as

𝑛𝑝 = ∑︁ 𝑘=0𝑝(𝑛 𝑘)𝑘!𝑆(𝑝,𝑘).

Theorem 9.6. Let 𝑥 R and 𝑝 be a non-negative integer. Then Stirling numbers of the second kind satisfy

𝑆(𝑝,𝑘)= (Δ𝑘𝑛𝑝)(0) 𝑘! ,𝑝 𝑘 0.

Proof. The middle term in Theorem 9.5 takes the same form as Equation (9.4). Since the inverse of the process of taking the difference is the process of taking the sum (refer to the difference table), (Δ𝑘𝑛𝑝)(0)= 𝑘!𝑆(𝑝,𝑘). □

9.3.2 of the first kind

Definition 9.9. The (unsigned) Stirling numbers of the first kind, written 𝑐(𝑝,𝑘), are defined as the number of permutations of 𝑝 elements with 𝑘 disjoint circles (Section 3.7.2).

We have 𝑐(𝑝,0)= 0 for all 𝑝 1 and assume 𝑐(0,0)= 1.

Proposition 9.4. For any 𝑝 > 𝑘 1, the (unsigned) Stirling numbers of the first kind satisfy

𝑐𝑝,𝑘 = 𝑐𝑝1,𝑘1 +(𝑝 1)𝑐𝑝1,𝑘,

with initial conditions 𝑐(𝑝,𝑝)= 1 for all 𝑝.

Proof. The permutations of a 𝑝-set can be divided into two kinds:

Theorem 9.7 (algebraic meaning). The signed Stirling numbers of the first kind are the coefficients of 𝑛𝑘 in the expansion the falling factorial (𝑛)𝑝:

(𝑛)𝑝 = ∑︁ 𝑘=0𝑝𝑠(𝑝,𝑘)𝑛𝑘.

It is easy to see from the expansion of (𝑛)𝑝 that the coefficient of 𝑛𝑘 is a product of (𝑝 𝑘)negative numbers, so the sign of 𝑠(𝑝,𝑘) is (1)𝑝𝑘. The unsigned Stirling numbers of the first kind satisfy

𝑐(𝑝,𝑘)= |𝑠(𝑝,𝑘)|= (1)𝑝𝑘𝑠(𝑝,𝑘).
[TODO: rising polynomials]

Definition 9.10. The elementary symmetric polynomials 𝑠0,𝑠1,,𝑠𝑝 in 𝑝 variables 𝑥1,,𝑥𝑝 are defined as the coefficients in the of 𝑦𝑘 in the expansion of the polynomial (𝑦 +𝑥1)(𝑦 +𝑥𝑝), i.e.,

(𝑦 +𝑥1)(𝑦 +𝑥𝑝)= ∑︁ 𝑘=0𝑝𝑠 𝑝𝑘(𝑥1,,𝑥𝑘)𝑦𝑘.

Proof. [TODO: ]

Plugging 𝑥1 = 0,𝑥2 = 1,,𝑥𝑝 = 𝑝 +1 in the polynomial, we have

𝑦(𝑦 1)(𝑦 𝑝 +1) = (𝑦)𝑝 = ∑︁ 𝑘=0𝑝𝑠 𝑝𝑘(0,1,,𝑝 +1)𝑦𝑘 = ∑︁ 𝑘=0𝑝(1)𝑝𝑘𝑠 𝑝𝑘(0,1,,𝑝 1)𝑦𝑘.

Thus, we have 𝑠(𝑝,𝑘)= (1)𝑝𝑘𝑠𝑝𝑘(0,1,,𝑝 1). We define 𝑐(𝑝,𝑘)= 𝑠𝑝𝑘(0,1,,𝑝 1).

Proposition 9.5. Let 𝑝 𝑗 be non-negative integers.

∑︁ 𝑘=𝑗𝑝𝑆(𝑝,𝑘)𝑠(𝑘,𝑗)= 𝛿 𝑝𝑗 = ∑︁ 𝑘=𝑗𝑝𝑠(𝑝,𝑘)𝑆(𝑘,𝑗).

Proof. Let 𝑥 R be a variable. Fix 𝑝. The following holds for any 𝑥:

𝑥𝑝 = ∑︁ 𝑘=0𝑝𝑆(𝑝,𝑘)(𝑥) 𝑘 = ∑︁ 𝑘=0𝑝𝑆(𝑝,𝑘)∑︁ 𝑗=0𝑘𝑠(𝑘,𝑗)𝑥𝑗.

Thus, for any 0 𝑗 𝑝, the coefficient 𝑘=𝑗𝑝𝑆(𝑝,𝑘)𝑠(𝑘,𝑗) of 𝑥𝑗 is 𝛿𝑝𝑗.

Similar for the other part. □

10 Graph-related Problems

10.1 Matching

Definition 10.1 (Matching). Given a graph 𝐺 = (𝑉,𝐸), a subset 𝑀 𝐸 is a matching if no two edges in 𝑀 share a common endpoint.

A matching is said to be

10.2 Vertex Covering

Definition 10.2. A vertex subset 𝑈 of a graph 𝐺 = (𝑉,𝐸) is a vertex cover of 𝐺 if every edge of 𝐺 has an endpoint in 𝑈. A vertex cover is said to be minimum if there is no vertex cover with fewer vertices.

Proposition 10.1. Let 𝐺 be a graph. For any matching 𝑀 and vertex cover 𝐶 of 𝐺, we have |𝑀||𝐶|.

Proof. The vertex cover has to contain at least one endpoint of each edge in the matching. □

Corollary 10.1. Let 𝐺 be a graph. If a matching 𝑀 and a vertex cover 𝐶 of 𝐺 have the same cardinality, then 𝑀 is a maximum matching and 𝐶 is a minimum vertex covering.

Theorem 10.1 (Kőnig’s theorem [Kőnig1931]2 ). If 𝐺 is a bipartite graph, then the number of edges in a maximum matching equals the number of vertices in a minimum vertex covering.

Proof. TODO □

10.3 Systems of distinct representatives

Definition 10.3. Let 𝐴 = (𝐴1,𝐴2,,𝐴𝑛) be a family of nonempty subsets of a given set 𝑋. A system of distinct representatives (SDR) of 𝐴 is a family 𝑆 = (𝑎1,𝑎2,,𝑎𝑛) of distinct members, where 𝑎𝑖 𝐴𝑖 for all 𝑖.

It is easy to see that:

Proposition 10.2 (graph theoretical formulation). Let 𝐴 = (𝐴1,𝐴2,,𝐴𝑛) be a family of nonempty subsets. Construct a bipartite graph 𝐺 = (𝐿 𝑅,𝐸) with bipartite sets 𝐿 = 𝐴 and 𝑅 = 𝑖𝑛𝐴𝑖 and edge set 𝐸 = {(𝐴,𝑎): 𝐴 𝐿,𝑎 𝑅,𝑎 𝐴}. Then, 𝐴 has an SDR iff 𝐺 has a matching of size 𝑛 (i.e., an 𝐿-perfect matching).

Definition 10.4. Let 𝐴 = (𝐴1,𝐴2,,𝐴𝑛) be a family of nonempty subsets of a given set 𝑋. Then 𝐴 is said to satisfy the marriage condition (aka Hall’s condition) iff

| 𝑖𝐼𝐴𝑖| |𝐼|,𝐼 [𝑛].

It is obvious that the marriage condition is necessary for 𝐴 to have a system of distinct representatives. However, it turns out that this condition is also sufficient:

Theorem 10.2 (Hall [1935]). The theorem has two equivalent formulations:

graph theoretical. We only need to prove that if there does not exist an 𝐿-perfect matching then the Hall’s condition does not hold.

If the maximum matching has fewer than 𝑛 edges, then, by Kőnig’s Theorem 10.1, the minimum vertex covering also has fewer than 𝑛 nodes. Let 𝑈 be a minimum vertex covering. Let 𝐿1 = 𝐿 𝑈,𝑅1 = 𝑅 𝑈,𝐿2 = 𝐿 \𝐿1. |𝑈|= |𝐿1|+|𝑅1|< 𝑛.

Edges connected with 𝐿2 can only be covered by nodes in 𝑅1, implying that 𝑁(𝐿2)𝑅1. Since |𝑅1|= |𝑈||𝐿1|< 𝑛|𝐿1|= |𝐿||𝐿1|= |𝐿2|, we have |𝐿2|> |𝑁(𝐿2)|, violating the Hall’s conditions.

PIC
Figure 10: Proof

10.4 Stable Marriage

Definition 10.5. The stable matching problem is the problem of finding a stable matching between two equally sized sets of elements 𝐿 (men) and 𝑅 (women), where each element in one set has a total ordering about the other set.

A complete marriage (i.e., matching) is unstable if there exists two women 𝐴,𝐵 𝐿 and two men 𝑎,𝑏 𝑅 such that

In other words, there exist a pair of woman and man who both prefer each other to their current partner. Otherwise, the complete marriage is stable.

Theorem 10.3. A stable marriage always exists.

Algorithm 1 Gale and Shapley [1962]
Begin with all women marked as rejected, and no man rejected any woman.
0
If there is no rejected woman, then stop. Otherwise, go to Step 1.
1
Let each woman who is marked as rejected choose whom she ranks the highest among all men who have not yet rejected her.

Then go to Step 2.

2
Let each man pick whom he ranks the highest among all women who have chosen him and whom he has not yet rejected, defers to her decision, and reject the others.

Then go to Step 0.

Proof. [TODO: ]

References

   D. Gale and L. S. Shapley. College admissions and the stability of marriage. The American Mathematical Monthly, 69(1):9–15, 1962. doi: 10.1080/00029890.1962.11989827. URL https://doi.org/10.1080/00029890.1962.11989827.

   P. Hall. On representatives of subsets. Journal of the London Mathematical Society, s1-10(1):26–30, 1935. doi: https://doi.org/10.1112/jlms/s1-10.37.26. URL https://londmathsoc.onlinelibrary.wiley.com/doi/abs/10.1112/jlms/s1-10.37.26.

   D. Kőnig. Gráphok és mátrixok. Matematikai és Fizikai Lapok, 38, 1931. URL https://real-j.mtak.hu/7307/.

   T. Mansour and M. Schork. Commutation Relations, Normal Ordering, and Stirling Numbers. Chapman and Hall/CRC, 09 2015. doi: https://doi.org/10.1201/b18869.

   G. Szárnyas. Graphs and matrices: A translation of "graphok és matrixok" by dénes kőnig (1931), 2020. URL https://arxiv.org/abs/2009.03780.