11.5 Compactness of the interval


We already pointed out that our constructions of reals are entirely compatible with classical logic. Thus, by assuming the law of excluded middleMathworldPlanetmathPlanetmath (LABEL:eq:lem) and the axiom of choiceMathworldPlanetmath (LABEL:eq:ac) we could develop classical analysisMathworldPlanetmath, which would essentially amount to copying any standard book on analysis.

Nevertheless, anyone interested in computation, for example a numerical analyst, ought to be curious about developing analysis in a computationally meaningful setting. That analysis in a constructive setting is even possible was demonstrated by [Bishop1967]. As a sample of the differencesPlanetmathPlanetmath and similaritiesMathworldPlanetmath between classical and constructive analysis we shall briefly discuss just one topic—compactness of the closed intervalMathworldPlanetmathPlanetmath [0,1] and a couple of theoremsMathworldPlanetmath surrounding the concept.

Compactness is no exception to the common phenomenon in constructive mathematics that classically equivalentMathworldPlanetmathPlanetmathPlanetmathPlanetmathPlanetmath notions bifurcate. The three most frequently used notions of compactness are:

  1. 1.

    metrically compactPlanetmathPlanetmath: “Cauchy completePlanetmathPlanetmathPlanetmathPlanetmathPlanetmath and totally boundedPlanetmathPlanetmath”,

  2. 2.

    Bolzano–Weierstraß compact: “every sequenceMathworldPlanetmath has a convergentMathworldPlanetmathPlanetmath subsequenceMathworldPlanetmath”,

  3. 3.

    Heine-Borel compact: “every open cover has a finite subcover”.

These are all equivalent in classical mathematics. Let us see how they fare in homotopy type theory. We can use either the Dedekind or the Cauchy reals, so we shall denote the reals just as . We first recall several basic definitions.

Definition 11.5.1.

A metric space (M,d) is a set M with a map d:M×MR satisfying, for all x,y,z:M,

d(x,y) 0, d(x,y) =d(y,x),
d(x,y) =0x=y, d(x,z) d(x,y)+d(y,z).
Definition 11.5.2.

A Cauchy approximation in M is a sequence x:Q+M satisfying

(δ,ϵ).d(xδ,xϵ)<δ+ϵ.

The limit of a Cauchy approximation x:Q+M is a point :M satisfying

(ϵ,θ:+).d(xϵ,)<ϵ+θ.

A complete metric space is one in which every Cauchy approximation has a limit.

Definition 11.5.3.

For a positive rational ϵ, an ϵ-net in a metric space (M,d) is an element of

(n:)(x1,,xn:M)(y:M).(kn).d(xk,y)<ϵ.

In words, this is a finite sequence of points x1,,xn such that every point in M merely is within ϵ of some xk.

A metric space (M,d) is totally bounded when it has ϵ-nets of all sizes:

(ϵ:+)(n:)(x1,,xn:M)(y:M).(kn).d(xk,y)<ϵ.
Remark 11.5.4.

In the definition of total boundedness we used sloppy notation (n:N)(x1,,xn:M). Formally, we should have written (x:List(M)) instead, where List(M) is the inductive type of finite lists from \autorefsec:bool-nat. However, that would make the rest of the statement a bit more cumbersome to express.

Note that in the definition of total boundedness we require pure existence of an ϵ-net, not mere existence. This way we obtain a function which assigns to each ϵ:+ a specific ϵ-net. Such a function might be called a “modulusMathworldPlanetmathPlanetmathPlanetmath of total boundedness”. In general, when porting classical metric notions to homotopy type theory, we should use propositional truncationMathworldPlanetmath sparingly, typically so that we avoid asking for a non-constant map from to or . For instance, here is the “correct” definition of uniform continuity.

Definition 11.5.5.

A map f:MR on a metric space is uniformly continuous when

(ϵ:+)(δ:+)(x,y:M).d(x,y)<δ|f(x)-f(y)|<ϵ.

In particular, a uniformly continuous map has a modulus of uniform continuity, which is a function that assigns to each ϵ a corresponding δ.

Let us show that [0,1] is compact in the first sense.

Theorem 11.5.6.

The closed interval [0,1] is complete and totally bounded.

Proof.

Given ϵ:+, there is n: such that 2/k<ϵ, so we may take the ϵ-net xi=i/k for i=0,,k-1. This is an ϵ-net because, for every y:[0,1] there merely exists i such that 0i<k and (i-1)/k<y<(i+1)/k, and so |y-xi|<2/k<ϵ.

For completeness of [0,1], consider a Cauchy approximation x:+[0,1] and let be its limit in . Since max and min are Lipschitz maps, the retractionMathworldPlanetmath r:[0,1] defined by r(x):max(0,min(1,x)) commutes with limits of Cauchy approximations, therefore

r()=r(limx)=lim(rx)=r(limx)=,

which means that 01, as required. ∎

We thus have at least one good notion of compactness in homotopy type theory. Unfortunately, it is limited to metric spaces because total boundedness is a metric notion. We shall consider the other two notions shortly, but first we prove that a uniformly continuous map on a totally bounded space has a supremum, i.e. an upper boundMathworldPlanetmath which is less than or equal to all other upper bounds.

Theorem 11.5.7.

A uniformly continuous map f:MR on a totally bounded metric space (M,d) has a supremum m:R. For every ϵ:Q+ there exists u:M such that |m-f(u)|<ϵ.

Proof.

Let h:++ be the modulus of uniform continuity of f. We define an approximation x:+ as follows: for any ϵ: total boundedness of M gives a h(ϵ)-net y0,,yn. Define

xϵ:max(f(y0),,f(yn)).

We claim that x is a Cauchy approximation. Consider any ϵ,η:, so that

xϵmax(f(y0),,f(yn))andxηmax(f(z0),,f(zm))

for some h(ϵ)-net y0,,yn and h(η)-net z0,,zm. Every zi is merely h(ϵ)-close to some yj, therefore |f(zi)-f(yj)|<ϵ, from which we may conclude that

f(zi)<ϵ+f(yj)ϵ+xϵ,

therefore xη<ϵ+xϵ. Symmetrically we obtain xη<η+xη, therefore |xη-xϵ|<η+ϵ.

We claim that m:limx is the supremum of f. To prove that f(x)m for all x:M it suffices to show ¬(m<f(x)). So suppose to the contrary that m<f(x). There is ϵ:+ such that m+ϵ<f(x). But now merely for some yi participating in the definition of xϵ we get |f(x)-f(yi)<ϵ, therefore m<f(x)-ϵ<f(yi)m, a contradictionMathworldPlanetmathPlanetmath.

We finish the proof by showing that m satisfies the second part of the theorem, because it is then automatically a least upper bound. Given any ϵ:+, on one hand |m-f(xϵ/2)|<3ϵ/4, and on the other |f(xϵ/2)-f(yi)|<ϵ/4 merely for some yi participating in the definition of xϵ/2, therefore by taking u:yi we obtain |m-f(u)|<ϵ by triangle inequalityMathworldMathworldPlanetmathPlanetmath. ∎

Now, if in \autorefctb-uniformly-continuous-sup we also knew that M were complete, we could hope to weaken the assumptionPlanetmathPlanetmath of uniform continuity to continuity, and strengthen the conclusionMathworldPlanetmath to existence of a point at which the supremum is attained. The usual proofs of these improvements rely on the facts that in a complete totally bounded space

  1. 1.

    continuity implies uniform continuity, and

  2. 2.

    every sequence has a convergent subsequence.

The first statement follows easily from Heine-Borel compactness, and the second is just Bolzano–Weierstraß compactness. Unfortunately, these are both somewhat problematic. Let us first show that Bolzano–Weierstraß compactness implies an instance of excluded middle known as the limited principle of omniscience: for every α:𝟐,

(n:α(n)=1𝟐)+(n:α(n)=0𝟐). (11.5.8)

Computationally speaking, we would not expect this principle to hold, because it asks us to decide whether infinitely many values of a function are 0𝟐.

Theorem 11.5.9.

Bolzano–Weierstraß compactness of [0,1] implies the limited principle of omniscience.

Proof.

Given any α:𝟐, define the sequence x:[0,1] by

xn:{0if α(k)=0𝟐 for all k<n,1if α(k)=1𝟐 for some k<n.

If the Bolzano–Weierstraß property holds, there exists a strictly increasing f: such that xf is a Cauchy sequenceMathworldPlanetmathPlanetmath. For a sufficiently large n: the n-th term xf(n) is within 1/6 of its limit. Either xf(n)<2/3 or xf(n)>1/3. If xf(n)<2/3 then xn convergesPlanetmathPlanetmath to 0 and so (n:)α(n)=0𝟐. If xf(n)>1/3 then xf(n)=1, therefore (n:)α(n)=1𝟐. ∎

While we might not mourn Bolzano–Weierstraß compactness too much, it seems harder to live without Heine–Borel compactness, as attested by the fact that both classical mathematics and Brouwer’s Intuitionism accepted it. As we do not want to wade too deeply into general topology, we shall work with basic open sets. In the case of these are the open intervals with rational endpoints. A family of such intervals, indexed by a type I, would be a map

:I\setof(q,r):×|q<r,

with the idea that a pair of rationals (q,r) with q<r determines the type \setofx:|q<x<r. It is slightly more convenient to allow degenerate intervals as well, so we take a family of basic intervals to be a map

:I×.

To be quite precise, a family is a dependent pair (I,), not just . A finite family of basic intervals is one indexed by \setofm:|m<n for some n:. We usually present it by a finite list [(q0,r0),,(qn-1,rn-1)]. Finally, a finite subfamily of (I,) is given by a list of indices [i1,,in] which then determine the finite family [(i1),,(in)].

As long as we are aware of the distinction between a pair (q,r) and the corresponding interval \setofx:|q<x<r, we may safely use the same notation (q,r) for both. IntersectionsMathworldPlanetmath and inclusions of intervals are expressible in terms of their endpoints:

(q,r)(s,t)  :(max(q,s),min(r,t)),
(q,r)(s,t)  :(q<rsq<rt).

We say that (I,λi.(qi,ri)) (pointwise) covers [a,b] when

(x:[a,b]).(i:I).qi<x<ri. (11.5.10)

The Heine-Borel compactness for [0,1] states that every covering family of [0,1] merely has a finite subfamily which still covers [0,1].

Theorem 11.5.11.

If excluded middle holds then [0,1] is Heine-Borel compact.

Proof.

Assume for the purpose of reaching a contradiction that a family (I,λi.(ai,bi)) covers [0,1] but no finite subfamily does. We construct a sequence of closed intervals [qn,rn] which are nested, their sizes shrink to 0, and none of them is covered by a finite subfamily of (I,λi.(ai,bi)).

We set [q0,r0]:[0,1]. Assuming [qn,rn] has been constructed, let s:(2qn+rn)/3 and t:(qn+2rn)/3. Both [qn,t] and [s,rn] are covered by (I,λi.(ai,bi)), but they cannot both have a finite subcover, or else so would [qn,rn]. Either [qn,t] has a finite subcover or it does not. If it does we set [qn+1,rn+1]:[s,rn], otherwise we set [qn+1,rn+1]:[qn,t].

The sequences q0,q1, and r0,r1, are both Cauchy and they converge to a point x:[0,1] which is contained in every [qn,rn]. There merely exists i:I such that ai<x<bi. Because the sizes of the intervals [qn,rn] shrink to zero, there is n: such that ai<qnxrn<bi, but this means that [qn,rn] is covered by a single interval (ai,bi), while at the same time it has no finite subcover. A contradiction. ∎

Without excluded middle, or a pinch of Brouwerian Intuitionism, we seem to be stuck. Nevertheless, Heine-Borel compactness of [0,1] can be recovered in a constructive setting, in a fashion that is still compatible with classical mathematics! For this to be done, we need to revisit the notion of cover. The trouble with (11.5.10) is that the truncated existential allows a space to be covered in any haphazard way, and so computationally speaking, we stand no chance of merely extracting a finite subcover. By removing the truncation we get

(x:[0,1])(i:I)qi<x<ri, (11.5.12)

which might help, were it not too demanding of covers. With this definition we could not even show that (0,3) and (2,5) cover [1,4] because that would amount to exhibiting a non-constant map [1,4]𝟐, see \autorefex:reals-non-constant-into-Z. Here we can take a lesson from “pointfree topology” (i.e. locale theory): the notion of cover ought to be expressed in terms of open sets, without reference to points. Such a “holistic” view of space will then allow us to analyze the notion of cover, and we shall be able to recover Heine-Borel compactness. Locale theory uses power setsMathworldPlanetmath, which we could obtain by assuming propositional resizing; but instead we can steal ideas from the predicative cousin of locale theory, which is called “formal topology”.

Suppose that we have a family (I,) and an interval (a,b). How might we express the fact that (a,b) is covered by the family, without referring to points? Here is one: if (a,b) equals some (i) then it is covered by the family. And another one: if (a,b) is covered by some other family (J,𝒢), and in turn each 𝒢(j) is covered by (I,), then (a,b) is covered (I,). Notice that we are listing rules which can be used to deduce that (I,) covers (a,b). We should find sufficiently good rules and turn them into an inductive definition.

Definition 11.5.13.

The inductive cover is a mere relationMathworldPlanetmath

:(×)(I:𝒰(I×))𝖯𝗋𝗈𝗉

defined inductively by the following rules, where q,r,s,t are rational numbers and (I,F), (J,G) are families of basic intervals:

  1. 1.

    reflexivityMathworldPlanetmath: (i)(I,) for all i:I,

  2. 2.

    transitivity: if (q,r)(J,𝒢) and (j:J).𝒢(j)(I,) then (q,r)(I,),

  3. 3.

    monotonicity: if (q,r)(s,t) and (s,t)(I,) then (q,r)(I,),

  4. 4.

    localization: if (q,r)(I,) then (q,r)(s,t)(I,λi.((i)(s,t))).

  5. 5.

    if q<s<t<r then (q,r)[(q,t),(r,s)],

  6. 6.

    (q,r)(\setof(s,t):×|q<s<t<r,λu.u).

The definition should be read as a higher-inductive type in which the listed rules are point constructors, and the type is (-1)-truncated. The first four clauses are of a general nature and should be intuitively clear. The last two clauses are specific to the real line: one says that an interval may be covered by two intervals if they overlap, while the other one says that an interval may be covered from within. Incidentally, if rq then (q,r) is covered by the empty family by the last clause.

Inductive covers enjoy the Heine-Borel property, the proof of which requires a lemma.

Lemma 11.5.14.

Suppose q<s<t<r and (q,r)(I,F). Then there merely exists a finite subfamily of (I,F) which inductively covers (s,t).

Proof.

We prove the statement by inductionMathworldPlanetmath on (q,r)(I,). There are six cases:

  1. 1.

    Reflexivity: if (q,r)=(i) then by monotonicity (s,t) is covered by the finite subfamily [(i)].

  2. 2.

    Transitivity: suppose (q,r)(J,𝒢) and (j:J).𝒢(j)(I,). By the inductive hypothesis there merely exists [𝒢(j1),,𝒢(jn)] which covers (s,t). Again by the inductive hypothesis, each of 𝒢(jk) is covered by a finite subfamily of (I,), and we can collect these into a finite subfamily which covers (s,t).

  3. 3.

    Monotonicity: if (q,r)(u,v) and (u,v)(I,) then we may apply the inductive hypothesis to (u,v)(I,) because u<s<t<v.

  4. 4.

    Localization: suppose (q,r)(I,) and (q,r)=(q,r)(a,b). Because q<s<t<r, by the inductive hypothesis there is a finite subcover [(i1),,(in)] of (s,t). We also know that a<s<t<b, therefore (s,t)=(s,t)(a,b) is covered by [(i1)(a,b),,(in)(a,b)], which is a finite subfamily of (I,λi.((i)(a,b))).

  5. 5.

    If (q,r)[(q,v),(u,r)] for some q<u<v<r then by monotonicity (s,t)[(q,v),(u,r)].

  6. 6.

    Finally, (s,t)(\setof(u,v):×|q<u<v<r,λz.z) by reflexivity. ∎

Say that (I,) inductively covers [a,b] when there merely exists ϵ:+ such that (a-ϵ,b+ϵ)(I,).

Corollary 11.5.15.

A closed interval is Heine-Borel compact for inductive covers.

Proof.

Suppose [a,b] is inductively covered by (I,), so there merely is ϵ:+ such that (a-ϵ,b+ϵ)(I,). By \autorefreals-formal-topology-locally-compact there is a finite subcover of (a-ϵ/2,b+ϵ/2), which is therefore a finite subcover of [a,b]. ∎

Experience from formal topology shows that the rules for inductive covers are sufficient for a constructive developmentMathworldPlanetmath of pointfree topology. But we can also provide our own evidence that they are a reasonable notion.

Theorem 11.5.16.

  1. 1.

    An inductive cover is also a pointwise cover.

  2. 2.

    Assuming excluded middle, a pointwise cover is also an inductive cover.

Proof.

  1. 1.

    Consider a family of basic intervals (I,), where we write (qi,ri):(i), an interval (a,b) inductively covered by (I,), and x such that a<x<b. We prove by induction on (a,b)(I,) that there merely exists i:I such that qi<x<ri. Most cases are pretty obvious, so we show just two. If (a,b)(I,) by reflexivity, then there merely is some i:I such that (a,b)=(qi,ri) and so qi<x<ri. If (a,b)(I,) by transitivity via (J,λj.(sj,tj)) then by the inductive hypothesis there merely is j:J such that sj<x<tj, and then since (sj,tj)(I,) again by the inductive hypothesis there merely exists i:I such that qi<x<ri. Other cases are just as exciting.

  2. 2.

    Suppose (I,λi.(qi,ri)) pointwise covers (a,b). By \autorefdefn:inductive-cover-interval-2 of \autorefdefn:inductive-cover it suffices to show that (I,λi.(qi,ri)) inductively covers (c,d) whenever a<c<d<b, so consider such c and d. By \autorefclassical-Heine-Borel there is a finite subfamily [i1,,in] which already pointwise covers [c,d], and hence (c,d). Let ϵ:+ be a Lebesgue number for (qi1,ri1),,(qin,rin) as in \autorefex:finite-cover-lebesgue-number. There is a positive k: such that 2(d-c)/k<min(1,ϵ). For 0ik let

    ck:((k-i)c+id)/k.

    The intervals (c0,c2), (c1,c3), …, (ck-2,ck) inductively cover (c,d) by repeated use of transitivity and \autorefdefn:inductive-cover-interval-1 in \autorefdefn:inductive-cover. Because their widths are below ϵ each of them is contained in some (qi,ri), and we may use transitivity and monotonicity to conclude that (I,λi.(qi,ri)) inductively cover (c,d). ∎

The upshot of the previous theorem is that, as far as classical mathematics is concerned, there is no difference between a pointwise and an inductive cover. In particular, since it is consistent to assume excluded middle in homotopy type theory, we cannot exhibit an inductive cover which fails to be a pointwise cover. Or to put it in a different way, the difference between pointwise and inductive covers is not what they cover but in the proofs that they cover.

We could write another book by going on like this, but let us stop here and hope that we have provided ample justification for the claim that analysis can be developed in homotopy type theory. The curious reader should consult \autorefex:mean-value-theorem for constructive versions of the mean value theorem.

Title 11.5 Compactness of the interval
\metatable