11.6 The surreal numbers
In this section we consider another example of a higher inductive-inductive type, which draws together many of our threads: Conway’s field of surreal numbers [conway:onag]. The surreal numbers are the natural common generalization of the (Dedekind) real numbers (\autorefsec:dedekind-reals) and the ordinal numbers (\autorefsec:ordinals). Conway, working in classical mathematics with excluded middle and Choice, defines a surreal number to be a pair of sets of surreal numbers, written , such that every element of is strictly less than every element of . This obviously looks like an inductive definition, but there are three issues with regarding it as such.
Firstly, the definition requires the relation of (strict) inequality between surreals, so that relation must be defined simultaneously with the type of surreals. (Conway avoids this issue by first defining games, which are like surreals but omit the compatibility condition on and .) As with the relation for the Cauchy reals, this simultaneous definition could a priori be either inductive-inductive or inductive-recursive. We will choose to make it inductive-inductive, for the same reasons we made that choice for .
Moreover, we will define strict inequality and non-strict inequality for surreals separately (and mutually inductively). Conway defines in terms of , in a way which is sensible classically but not constructively. Furthermore, a negative definition of would make it unacceptable as a hypothesis of the constructor of a higher inductive type (see \autorefsec:strictly-positive).
Secondly, Conway says that and in should be “sets of surreal numbers”, but the naive meaning of this as a predicate is not positive, hence cannot be used as input to an inductive constructor. However, this would not be a good type-theoretic translation of what Conway means anyway, because in set theory the surreal numbers form a proper class, whereas the sets and are true (small) sets, not arbitrary subclasses of . In type theory, this means that will be defined relative to a universe , but will itself belong to the next higher universe , like the sets and of ordinals and cardinals, the cumulative hierarchy , or even the Dedekind reals in the absence of propositional resizing. We will then require the “sets” and of surreals to be -small, and so it is natural to represent them by families of surreals indexed by some -small type. (This is all exactly the same as what we did with the cumulative hierarchy in \autorefsec:cumulative-hierarchy.) That is, the constructor of surreals will have type
which is indeed strictly positive.
Finally, after giving the mutual definitions of and its ordering, Conway declares two surreal numbers and to be equal if and . This is naturally read as passing to a quotient of the set of “pre-surreals” by an equivalence relation. However, in the absence of the axiom of choice, such a quotient presents the same problem as the quotient in the usual construction of Cauchy reals: it will no longer be the case that a pair of families of surreals yield a new surreal , since we cannot necessarily “lift” and to families of pre-surreals. Of course, we can solve this problem in the same way we did for Cauchy reals, by using a higher inductive-inductive definition.
Definition 11.6.1.
The type of surreal numbers, along with the relations and , are defined higher inductive-inductively as follows. The type has the following constructors.
-
•
For any and functions and , whose values we write as and for and respectively, if , then there is a surreal number .
-
•
For any such that and , we have .
We will refer to the inputs of the first constructor as a cut. If is the surreal number constructed from a cut, then the notation will implicitly assume , and similarly will assume . In this way we can usually avoid naming the indexing types and , which is convenient when there are many different cuts under discussion. Following Conway, we call a left option of and a right option.
The path constructor implies that different cuts can define the same surreal number. Thus, it does not make sense to speak of the left or right options of an arbitrary surreal number , unless we also know that is defined by a particular cut. Thus in what follows we will say, for instance, “given a cut defining a surreal number ” in contrast to “given a surreal number ”.
The relation has the following constructors.
-
•
Given cuts defining two surreal numbers and , if for all , and for all , then .
-
•
Propositional truncation: for any , if , then .
And the relation has the following constructors.
-
•
Given cuts defining two surreal numbers and , if there is an such that , then .
-
•
Given cuts defining two surreal numbers and , if there is an such that , then .
-
•
Propositional truncation: for any , if , then .
We compare this with Conway’s definitions:
-
-
If are any two sets of numbers, and no member of is any member of , then there is a number . All numbers are constructed in this way.
-
-
iff (no and no ).
-
-
iff ( and ).
-
-
iff ( and ).
The inclusion of in the definition of is unnecessary if all objects are [surreal] numbers rather than “games”. Thus, Conway’s is just the negation of his , so that his condition for to be a surreal is the same as ours. Negating Conway’s and canceling double negations, we arrive at our definition of , and we can then reformulate his in terms of without negations.
We can immediately populate with many surreal numbers. Like Conway, we write
for the surreal number defined by a cut where and are families described by and . Of course, if or are , we leave the corresponding part of the notation empty. There is an unfortunate clash with the standard notation for subsets, but we will not use the latter in this section.
-
•
We define recursively by
That is, is defined by the cut consisting of and . Similarly, is defined by (picking out ) and .
-
•
Similarly, we define using the sign-case recursion principle (\autorefthm:sign-induction):
, . -
•
By a dyadic rational we mean a pair where and , and such that if then is odd. We will write it as , and identify it with the corresponding rational number. If denotes the set of dyadic rationals, we define by induction on :
Here we use the fact that if and is odd, then is a dyadic rational with a smaller denominator than .
- •
-
•
Recall the type of ordinals from \autorefsec:ordinals, which is well-ordered by the relation , where means that for some . We define by well-founded recursion (\autorefthm:wfrec) on :
It will also follow from the simplicity theorem that restricted to finite ordinals agrees with .
-
•
A few more interesting examples taken from Conway:
In identifying surreal numbers presented by different cuts, the following simple observation is useful.
Theorem 11.6.2 (Conway’s simplicity theorem).
Suppose and are surreal numbers defined by cuts, and that the following hold.
-
•
for all and .
-
•
For every left option of , there exists a left option with .
-
•
For every right option of , there exists a right option with .
Then .
Proof.
Applying the path constructor of , we must show and . The first entails showing for all , which we assumed, and for all . But by assumption, for any there is an with hence as desired. Thus ; the proof of is symmetric. ∎
In order to say much more about surreal numbers, however, we need their induction principle. The mutual induction principle for applies to three families of types:
As with the induction principle for Cauchy reals, it is helpful to think of and as families of relations between the types and . Thus we write as and as . Similarly, we usually omit the since it inhabits a mere proposition and so is uninteresting, and we may often omit and as well, writing simply or . With these notations, the hypotheses of the induction principle are the following.
-
•
For any cut defining a surreal number , together with
-
(a)
for each , an element , and
-
(b)
for each , an element , such that
-
(c)
for all and we have
there is a specified element . We call such data a dependent cut over the cut defining .
-
(a)
-
•
For any with and , if and and also and , then .
-
•
Given cuts defining two surreal numbers and , and dependent cuts over and over , such that for all we have and , and for all we have and , then .
-
•
takes values in mere propositions.
-
•
Given cuts defining two surreal numbers and , dependent cuts over and over , and an such that and , we have .
-
•
Given cuts defining two surreal numbers and , dependent cuts over and over , and an such that together with , we have .
-
•
takes values in mere propositions.
Under these hypotheses we deduce a function such that
(11.6.3) | ||||
In the computation rule (11.6.3) for the point constructor, is a surreal number defined by a cut, and denotes the dependent cut over defined by applying (and using the fact that takes to ). As usual, we will generally use pattern-matching notation, where the definition of on a cut may use the symbols and and the assumption that they form a dependent cut.
As with the Cauchy reals, we have special cases resulting from trivializing some of , , and . Taking and to be constant at , we have -induction, which for simplicity we state only for mere properties:
-
•
Given , if holds whenever is a surreal number defined by a cut such that and hold for all and , then holds for all .
This should be compared with Conway’s remark:
In general when we wish to establish a proposition for all numbers , we will prove it inductively by deducing from the truth of all the propositions and . We regard the phrase “all numbers are constructed in this way” as justifying the legitimacy of this procedure.
With -induction, we can prove
Theorem 11.6.3 (Conway’s Theorem 0).
-
1.
For any , we have .
-
2.
For any defined by a cut, we have and for all and .
Proof.
Note first that if , then whenever occurs as a left option of some cut , we have by the first constructor of , and similarly whenever occurs as a right option of a cut , we have by the second constructor of . In particular, 12.
We prove 1 by -induction on . Thus, assume is defined by a cut such that and for all and . But by our observation above, these assumptions imply and for all and , yielding by the constructor of . ∎
Corollary 11.6.4.
is a 0-type.
Proof.
By contrast, Conway’s Theorem 1 (transitivity of ) is somewhat harder to establish with our definition; see \autorefthm:NO-unstrict-transitive.
We will also need the joint recursion principle, -recursion, which it is convenient to state as follows. Suppose is a type equipped with relations and . Then we can define by doing the following.
-
1.
For any defined by a cut, assuming and to be defined such that for all and , we must define . (We call this the primary clause of the recursion.)
-
2.
Prove that is antisymmetric: if and , then .
-
3.
For defined by cuts such that for all and for all , and assuming inductively that for all , for all , and also that and for all and , we must prove .
-
4.
For defined by cuts and an such that , and assuming inductively that , and also that and for all and , we must prove .
-
5.
For defined by cuts and an such that , and assuming inductively that , and also that and for all and , we must prove .
The last three clauses can be more concisely described by saying we must prove that (as defined in the first clause) takes to and to . We will refer to these properties by saying that preserves inequalities. Moreover, in proving that preserves inequalities, we may assume the particular instance of or to be obtained from one of its constructors, and we may also use inductive hypotheses that preserves all inequalities appearing in the input to that constructor.
If we succeed at 1–5 above, then we obtain , which computes on cuts as specified by 1, and which preserves all inequalities:
Like -recursion for the Cauchy reals, this recursion principle is essential for defining functions on , since we cannot first define a function on “pre-surreals” and only later prove that it respects the notion of equality.
Example 11.6.5.
Let us define the negation function . We apply the joint recursion principle with , with , and . Clearly this is antisymmetric.
For the main clause in the definition, we assume defined by a cut, with and defined such that for all and . By definition, this means for all and , so we can define by the cut . This notation, which follows Conway, refers to the cut whose left options are indexed by the type indexing the right options of , and whose right options are indexed by the type indexing the left options of , with the corresponding families and defined by composing those for with negation.
We now have to verify that preserves inequalities.
-
•
For , we may assume for all and for all , and show . But inductively, we may assume and , which gives the desired result, by definition of , , and the constructor of .
-
•
For , in the first case when it arises from some , we may inductively assume , in which case follows by the constructor of .
-
•
Similarly, if arises from , the inductive hypothesis is , yielding again.
To do much more than this, however, we will need to characterize the relations and more explicitly, as we did for the Cauchy reals in \autorefthm:RC-sim-characterization. Also as there, we will have to simultaneously prove a couple of essential properties of these relations, in order for the induction to go through.
Theorem 11.6.6.
There are relations and such that if and are surreals defined by cuts, then
Moreover, we have
(11.6.6) |
and all the reasonable transitivity properties making and into a “bimodule” over and :
(11.6.7) |
Proof.
We define and by double -induction on . The first induction is a simple recursion, whose codomain is the subset of consisting of pairs of predicates of which one implies the other and which satisfy “transitivity on the right”, i.e. (11.6.6) and the right column of (11.6.7) with and replaced by the two given predicates. As in the proof of \autorefdefn:RC-approx, we regard these predicates as half of binary relations, writing them as and , with denoting the pair of relations. We equip with the following two relations:
Note that is antisymmetric, since if and , then and for all , hence by univalence for mere propositions and function extensionality. Moreover, to say that a function preserves inequalities is exactly to say that, when regarded as a pair of binary relations on , it satisfies “transitivity on the left” (the left column of (11.6.7)).
Now for the primary clause of the recursion, we assume given defined by a cut, and relations , , , and for all and , of which the strict ones imply the non-strict ones, which satisfy transitivity on the right, and such that
(11.6.9) |
We now have to define and for all . Here in contrast to \autorefdefn:RC-approx, rather than a nested recursion, we use a nested induction, in order to be able to inductively use transitivity on the left with respect to the inequalities and . Define by taking to be the subset of consisting of two mere propositions, denoted and (with ), such that
(11.6.10) | |||
(11.6.10) | |||
(11.6.10) |
Using notation analogous to and , we equip with the two relations defined for and by
Again, is evidently antisymmetric in the appropriate sense. Moreover, a function which preserves inequalities is precisely a pair of predicates of which one implies the other, which satisfy transitivity on the right, and transitivity on the left with respect to the inequalities and . Thus, this inner induction will provide what we need to complete the primary clause of the outer recursion.
For the primary clause of the inner induction, we assume also given defined by a cut, and properties , , , and for all and , with the strict ones implying the non-strict ones, transitivity on the left with respect to and , and on the right with respect to . We can now give the definitions specified in the theorem statement:
(11.6.10) | ||||
(11.6.10) |
For this to define an element of , we must show first that . The assumption has two cases. On one hand, if there is with , then by transitivity on the right with respect to , we have for all . Moreover, by transitivity on the left with respect to , we have for any , hence by transitivity on the right. Thus, .
On the other hand, if there is with , then by transitivity on the left with respect to we have for all . And by transitivity on the left and right with respect to and , we have for any . Thus, .
We also need to show that these definitions are transitive on the left with respect to and . But if , then for all by definition; while if , then also by definition.
Thus, (11.6.10) and (11.6.10) do define an element of . We now have to verify that this definition preserves inequalities, as a dependent function into , i.e. that these relations are transitive on the right. Remember that in each case, we may assume inductively that they are transitive on the right with respect to all inequalities arising in the inequality constructor.
-
•
Suppose and , the latter arising from and for all and . Then the inductive hypothesis (of the inner recursion) applied to yields for any . Moreover, by definition implies that for any , so by the inductive hypothesis of the outer recursion we have . Thus, .
-
•
Suppose and . First, suppose arises from . Then the inner inductive hypothesis applied to yields , hence .
Second, suppose arises from . Then by definition, implies , and then the inner inductive hypothesis for yields .
-
•
Suppose and , the latter arising from and for all and . By definition, implies there merely exists with or with . If , then the outer inductive hypothesis yields , hence . If , then the inner inductive hypothesis for (which holds by the constructor of ) yields .
This completes the inner induction. Thus, for any defined by a cut, we have and defined by (11.6.10) and (11.6.10), and transitive on the right.
To complete the outer recursion, we need to verify these definitions are transitive on the left. After a -induction on , we end up with three cases that are essentially identical to those just described above for transitivity on the right. Hence, we omit them. ∎
Theorem 11.6.10.
For any we have and .
Proof.
From left to right, we use -induction where , with and supplying the relations and . In all the constructor cases, and are defined by cuts, so the definitions of and evaluate, and the inductive hypotheses apply.
From right to left, we use -induction to assume that and are defined by cuts. But now the definitions of and , and the inductive hypotheses, supply exactly the data required for the relevant constructors of and . ∎
Corollary 11.6.11.
The relations and on satisfy
and are transitive:
As with the Cauchy reals, the joint -recursion principle remains essential when defining all operations on .
Example 11.6.12.
We define by a double recursion. For the outer recursion, we take the codomain to be the subset of consisting of functions such that and for all . For such we define and . Clearly is antisymmetric.
For the primary clause of the recursion, we suppose defined by a cut, and we define by an inner recursion on with codomain , with relations and coinciding with and . For the primary clause of the inner recursion, we suppose also defined by a cut, and give Conway’s definition:
In other words, the left options of are all numbers of the form for some left option , or for some left option . Now we verify that this definition preserves inequality:
-
•
If arises from knowing that and for all and , then the inner inductive hypothesis gives and , while the outer inductive hypotheses give and . And since each is by definition a left option of , we have , and similarly . Thus, using transitivity, and , and so we may conclude by the constructor of .
-
•
If arises from an with , then inductively , hence since is a right option of .
-
•
Similarly, if arises from , then since .
This completes the inner recursion. For the outer recursion, we have to verify that preserves inequality on the left as well. After an -induction, this proceeds in exactly the same way.
In the Appendix to Part Zero of [conway:onag], Conway discusses how the surreal numbers may be formalized in ZFC set theory: by iterating along the ordinals and passing to sets of representatives of lowest rank for each equivalence class, or by representing numbers with “sign-expansions”. He then remarks that
The curiously complicated nature of these constructions tells us more about the nature of formalizations within ZF than about our system of numbers…
and goes on to advocate for a general theory of “permissible kinds of construction” which should include
-
1.
Objects may be created from earlier objects in any reasonably constructive fashion.
-
2.
Equality among the created objects can be any desired equivalence relation.
Condition 1 can be naturally read as justifying general principles of inductive definition, such as those presented in \autorefsec:strictly-positive,\autorefsec:generalizations. In particular, the condition of strict positivity for constructors can be regarded as a formalization of what it means to be “reasonably constructive”. Condition 2 then suggests we should extend this to higher inductive definitions of all sorts, in which we can impose path constructors making objects equal in any reasonable way. For instance, in the next paragraph Conway says:
…we could also, for instance, freely create a new object and call it the ordered pair of and . We could also create an ordered pair different from but co-existing with it…If instead we wanted to make into an unordered pair, we could define equality by means of the equivalence relation if and only if or .
The freedom to introduce new objects with new names, generated by certain forms of constructors, is precisely what we have in the theory of inductive definitions. Just as with our two copies of the natural numbers and in \autorefsec:appetizer-univalence, if we wrote down an identical definition to the cartesian product type , we would obtain a distinct product type whose canonical elements we could freely write as . And we could make one of these a type of unordered pairs by adding a suitable path constructor.
To be sure, Conway’s point was not to complain about ZF in particular, but to argue against all foundational theories at once:
…this proposal is not of any particular theory as an alternative to ZF… What is proposed is instead that we give ourselves the freedom to create arbitrary mathematical theories of these kinds, but prove a metatheorem which ensures once and for all that any such theory could be formalized in terms of any of the standard foundational theories.
One might respond that, in fact, univalent foundations is not one of the “standard foundational theories” which Conway had in mind, but rather the metatheory in which we may express our ability to create new theories, and about which we may prove Conway’s metatheorem. For instance, the surreal numbers are one of the “mathematical theories” Conway has in mind, and we have seen that they can be constructed and justified inside univalent foundations. Similarly, Conway remarked earlier that
…set theory would be such a theory, sets being constructed from earlier ones by processes corresponding to the usual axioms, and the equality relation being that of having the same members.
This description closely matches the higher-inductive construction of the cumulative hierarchy of set theory in \autorefsec:cumulative-hierarchy. Conway’s metatheorem would then correspond to the fact we have referred to several times that we can construct a model of univalent foundations inside ZFC (which is outside the scope of this book).
However, univalent foundations is so rich and powerful in its own right that it would be foolish to relegate it to only a metatheory in which to construct set-like theories. We have seen that even at the level of sets (0-types), the higher inductive types in univalent foundations yield direct constructions of objects by their universal properties (\autorefsec:free-algebras), such as a constructive theory of Cauchy completion (\autorefsec:cauchy-reals). But most importantly, the potential to model homotopy theory and category theory directly in the foundational system (\autorefcha:homotopy,\autorefcha:category-theory) gives univalent foundations an advantage which no set-theoretic foundation can match.
Title | 11.6 The surreal numbers |
---|---|
\metatable |