context | $ f:C\to \mathcal {\mathcal P}(C) $ |
definiendum | $ x\in D_f $ |
inclusion | $ D_f \subseteq C $ |
postulate | $ x \notin f(x) $ |
We take an arbitrary set $C$ and argue about all the functions $f:C\to \mathcal {\mathcal P}(C)$ from $C$ to the powerset ${\mathcal P}(C)$. For any such $f$, we define $ D_f$ as the subset of $C$ containing the elements $x\in C$ for which $ x \notin f(x) $.
So for example we might consider
$C=\{0,1\}$
so that
${\mathcal P}(C)=\{\{\},\{0\},\{1\},\{0,1\}\}$
and if we were to consider the function $f$ that maps $0$ to $f(0)=\{0,1\}$ and $1$ to $f(1)=\{0\}$, then as $0\in \{0,1\}$ but $1\notin \{0\}$, we are left with $D_f=\{1\}$.
The point of $ D_f \subseteq C $ defined as above is this: Given any set $C$, if one attempted to use its elements to index its subsets, i.e. via a function $f:C\to \mathcal P(C)$, then the “flipped diagonal subset” $ D_f \in {\mathcal P}$ will always be found to be missed.
Proof of the above and also proof of Cantor's theorem: For all $x,X$, we eighter have $x\in X$ or $x\notin X$. If $X=Y$, then $x\in X$ has the same truth value as $x\in Y$. So for all $x,X,Y$, we have
$(X=Y) \Rightarrow \neg(x\in X\land x\notin Y),$
which is the same as
$(x\in X\land x\notin Y) \Rightarrow \neg(X=Y).$
This holds always. Now, specifically, for any $x\in D_f$ and using $X=D_f$ and $Y=f(x)$, the left hand side reads $x\in D_f\land x\notin f(x)$ (which by definition of $D_f$ is the same as just $x\in D_f$) and the right hand side reads $\neg(f(x)=D_f)$. The same happens for $x\in C$ but not in $D_f$ and where we then switch $X$ and $Y$. This means $\nexists x\ (f(x)=D_f)$. Since $D_f\subseteq C$, i.e. $D_f \in \mathcal P(C) = \mathrm{codom} (f)$, we see that no such $f$ is a surjection, let alone a bijection. So the cardinality of any set is less than that of its power set. $\Box$
If, in a similar spirit, $C,f$ are taken to be a sequence and an enumeration of sequences, then we see that there is at least once sequence which escapes enumeration. This further translates to the uncountability of real numbers.
Notice the occurence of a negation of a formula in which $x$ appears twice in the comprehension part. This sort of set comprehension is typical for this sort of business.
A set is smaller than another one, if you can embed the first into the second but not the other way around. This is more formally defined using injective functions.
If $ {\mathbb N} $ denotes the natural numbers (let’s rather write $ \omega_0 $ for it here) and $ {\mathbb R} $ denotes the real numbers, then you can find an injection from the former set into the latter. This means there is a function $ i : \omega_0 \to {\mathbb R} $ so that for any distinct two $ n,m \in \omega_0 $ you have $ i(n) \neq i(m) $. For example $ i(n) := e^n $ does the job.
However, Cantor proves that you can’t find such an injective function in $ {\mathbb R} \to \omega_0 $. So the set of reals is bigger than the set of natural numbers.
Btw., note that the space of functions $ \omega_0 \to {\mathbb R} $ can be represented internally to a category of sets. I.e. there are representations of this collection of functions from $\omega_0$ to $ {\mathbb R} $ as a set, and this set (up to bijection) is written $ {\mathbb R}^{\omega_0} $. If $ \{0,1\} $ is some set with only two elements, Cantor also shows that even $\{0,1\}^{\omega_0}$ is bigger than $\omega_0$
Given that we have a notion of bigger now, call $\omega_1 $ the next big infinite set after the naturals and $\omega_2$ the set after that one.
Let’s now consider a large category of sets (or classes, or model, or however you want to call those collections) which obeys the laws of our standard theory of sets (ZFC). There is an object in this category which you denote by $\{0,1\}^{\omega_0}$, but what this thing actually contains contains depends on how you stuff it. The only rule is that it’s content doesn’t violate the laws of ZFC, but this still leaves some freedom. It’s now possible to find a category where the sets called $\{0,1\}^{\omega_0}$ and $\omega_1$ are in bijection. (By the standard construction of the real numbers from $\{0,1\}^{\omega_0}$, this also means the reals are the next big things after the naturals.)
BUT in the 60’s the mathematician Cohen showed you can define a category of sets that simultaneously fulfills the laws of ZFC, but where the sets are such that it’s possible to find an injective function from $\omega_2$ to $\{0,1\}^{\omega_0}$! This means taking $\omega_0$ to $\{0,1\}^{\omega_0}$ jumps in size over a whole kind of infinity, namely $\omega_1$. Since the category obeyed ZFC, then if ZFC is a consistent theory, this implies that you can’t use the formal laws of the theory ZFC to prove $\{0,1\}^{\omega_0}$ and $\omega_1$ are in bijection. We say that the statement is independent of this theory of sets.
Cohen invented a new technique to construct the above injection $\omega_2 \to \{0,1\}^{\omega_0}$. It’s called „forcing“ and a since then big deal in logic. Basically, he first notes that $\omega_2 \to \{0,1\}^{\omega_0}$ is in bijection to a characteristic function $\omega_2 \times {\omega_0} \to \{0,1\} $ He uses ZFC to construct the big set of functions $ \{ f : X \to \{0,1\} \} $ where X’s are finite subsets of $\omega_2 \times {\omega_0} $. Next comes some order theory, and „filters“ where patches the f’s together and ends up with one huge function $ F $, so that for different $ \alpha, \beta \in \omega_2 $, the functions $ n \mapsto F(\alpha, n) $ and $ n \mapsto F(\beta, n) $ are pairwise distinct. This then does the trick.
The continuum hypothesis being independent is somewhat ugly, I’d say. It implies, amongst other things, that from ZFC you can’t prove that given a big and a small set B and S, you cannot even prove that the number of subsets of B are more than the number of subsets of S. So even if your axioms of a theory of sets are in-your-face strong, like ZFC is, it’s hard to capture what you’d like to be true for sets.
The continuum hypothesis $\nexists Y.\,|\mathbb N|<|Y|<|\mathbb R|$ is independent of ZFC. Now, in fact, there are models where the continuum hypotsesis fails and where the $Y$ has $\mathcal P(Y)\cong\mathbb R$! The fact that of course also $\mathcal P(\mathbb N)\cong\mathbb R$ implies that ZFC (if consistent) doesn't prove “If a set $Y$ is bigger than another set $X$, then it has more subsets”. (Martin's axiom describes such a world where there is an $Y$ which is of bigger cardinality than the set of natural numbers, but not significalntly different anyway.)
Wikipedia: Cantor's theorem, Martin's axiom