Inclusion-Exclusion Principle (Pt. 1)

The Inclusion-Exclusion principle is one of those things I understood quickly and intuitively; in fact, it seemed obvious to me. It was not until I tried to prove it that I was amazed by its dependence on the binomial theorem and also realized it was not so obvious after all. In this post, I chose to take an unorthodox path and present the theorem at the end of the post rather than at the start.

Suppose you wanted to count the number of positive integers less than or equal to \(60\) not divisible by \(3\), how do you do it? Well, you include all \(60\) positive integers then subtract multiples of \(3\). There are \(\frac{60}{3}=20\) multiples of \(3\) less than or equal to \(60\). You're left with the required count \(60-20=40\). The 'Inclusion-Exclusion' refers to this process of including the \(60\) and then excluding (subtracting) the necessary numbers.

Now, suppose we include a second restriction: not divisible by \(5\). The idea is almost the same; Include all \(60\) positive integers, subtract the \(20\) multiples of \(3\) and the \(\frac{60}{5}=12\) multiples of \(5\). But wait, multiples of \(3\) and \(5\) (\(15, 30, 45, \text{ and } 60\)) were subtracted twice. Since we need them subtracted only once, we add them back to compensate, i.e., add back \(\frac{60}{3 \cdot 5} = 4\) multiples of \(3 \cdot 5=15\). The final answer becomes \(60 - 20 - 12 + 4 = 32\).

Finally, suppose we include the third restriction: not divisible by \(2\). Similarly, we include \(60\), subtract \(30\) even numbers, \(20\) multiples of \(3\), and \(12\) multiples of \(5\). Then add \(10\) multiples of \(2 \cdot 3=6\), \(6\) multiples of \(2 \cdot 5=10\), and \(4\) multiples of \(3 \cdot 5=15\). While this compensates for those number that multiples of only \(6\), or \(10\), or \(15\), it does not properly account for multiples of \(2 \cdot 3 \cdot 5=30\). Here's why. Multiples of \(30\) were subtracted \(3\) times: first as multiples of \(2\), of \(3\), and then of \(5\). Then they were added back \(3\) times: first as multiples of \(6\), of \(10\), and then of \(15\). Since we need them subtracted only once, we need to subtract the \(2\) multiples of \(30\) one last time. This gives \(60 - 30 - 20 - 12 + 10 + 6 + 4 - 2 = 16\).

This also works for general finite sets. If you had a universal set \(U\) and three subsets \(S_1, S_2, S_3\), then the number of elements in \(U\) not in any of these subsets will be: $$ |U| - \sum_{i=1}^3 |S_i| + \sum_{1\le i<j\le3}|S_i \cap S_j| - |S_1 \cap S_2 \cap S_3| $$

Now we must prove that the process of adding and subtraction always produces the required results for any given subsets \(S_i\), \(1 \le i \le n\). All elements of \(U\) can be partitioned into sets \(C_i = \{ x \in U : x \text{ is in } i \text{ subsets}\}\).

Case \(C_0\): elements in \(C_0\) are counted once when we add \(|U|\). This count is not altered again since subsequent calculations involve subsets \(S_i\) and no element of \(C_0\) is contained in any of these subsets.

Case \(C_a\): elements in \(C_a\) are counted once when we add \(|U|\). They are then subtracted \(\binom{a}{1}\) times when we subtract \(\sum_{i=1}^n |S_i|\); then added back \(\binom{a}{2}\) times when we added \(\displaystyle \sum_{1\le i<j\le n}|S_i \cap S_j|\). This goes on: subtracted \(\binom{a}{3}\) times, added \(\binom{a}{4}\) times, subtracted \(\binom{a}{5}\) times, added \(\binom{a}{6}\) times, etc.

So overall, the elements are counted \(1 - \binom{a}{1} + \binom{a}{2} - \binom{a}{3} + \cdots = \sum_{i=0}^a \binom{a}{i} (-1)^i(1)^{a-i} = (-1+1)^a = 0\) times.

The last step is an application of the binomial theorem: \((x+y)^n = \sum_{i=0}^n \binom{n}{i} x^i y^{n-i}\), setting \(n=a\), \(x=-1\), and \(y=1\).

Now this means that after applying the inclusion-exclusion principle, only the elements in \(C_0\), i.e., the elements in none of the given subsets \(S_i\), were counted.

Inclusion-Exclusion Principle. If \(U\) is the universal set and \({S_i : 1\le i \le n}\) the collection of subsets, then the number of elements in \(U\) not in any \(S_i\) is given by $$ |U| - \sum |S_i| + \sum |S_i \cap S_j| - \sum |S_i \cap S_j \cap S_k| + \cdots + (-1)^n |S_1 \cap \cdots \cap S_n| $$